<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Archiving and Interchange DTD v2.3 20070202//EN" "archivearticle.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="methods-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Appl. Math. Stat.</journal-id>
<journal-title>Frontiers in Applied Mathematics and Statistics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Appl. Math. Stat.</abbrev-journal-title>
<issn pub-type="epub">2297-4687</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fams.2023.1122114</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Applied Mathematics and Statistics</subject>
<subj-group>
<subject>Methods</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>GPMatch: A Bayesian causal inference approach using Gaussian process covariance function as a matching tool</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Huang</surname> <given-names>Bin</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1921892/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Chen</surname> <given-names>Chen</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2089022/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Liu</surname> <given-names>Jinzhong</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Sivaganisan</surname> <given-names>Siva</given-names></name>
<xref ref-type="aff" rid="aff4"><sup>4</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2200746/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Division of Biostatistics and Epidemiology, Cincinnati Children&#x00027;s Hospital Medical Center</institution>, <addr-line>Cincinnati, OH</addr-line>, <country>United States</country></aff>
<aff id="aff2"><sup>2</sup><institution>Department of Pediatrics, University of Cincinnati College of Medicine</institution>, <addr-line>Cincinnati, OH</addr-line>, <country>United States</country></aff>
<aff id="aff3"><sup>3</sup><institution>Regeneron Pharmaceuticals</institution>, <addr-line>Basking Ridge, NJ</addr-line>, <country>United States</country></aff>
<aff id="aff4"><sup>4</sup><institution>Department of Mathematical Sciences, University of Cincinnati</institution>, <addr-line>Cincinnati, OH</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Zexun Chen, University of Edinburgh, United Kingdom</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Rong Pan, Arizona State University, United States; Jeremy Gaskins, University of Louisville, United States</p></fn>

<corresp id="c001">&#x0002A;Correspondence: Bin Huang <email>bin.huang&#x00040;cchmc.org</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Mathematics of Computation and Data Science, a section of the journal Frontiers in Applied Mathematics and Statistics</p></fn></author-notes>
<pub-date pub-type="epub">
<day>08</day>
<month>03</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>9</volume>
<elocation-id>1122114</elocation-id>
<history>
<date date-type="received">
<day>12</day>
<month>12</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>09</day>
<month>02</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2023 Huang, Chen, Liu and Sivaganisan.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Huang, Chen, Liu and Sivaganisan</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license> </permissions>
<abstract>
<p>A Gaussian process (GP) covariance function is proposed as a matching tool for causal inference within a full Bayesian framework under relatively weaker causal assumptions. We demonstrate that matching can be accomplished by utilizing the GP prior covariance function to define the matching distance. The matching properties of GPMatch are presented analytically for the setting of categorical covariates. We suggest that GPMatch possesses doubly robust properties asymptotically when either (1) the GP mean function or (2) the GP covariance function is correctly specified. Simulation studies were carried out without assuming a priori knowledge of the functional form of either the outcome or the treatment assignment. The results demonstrate that GPMatch enjoys well-calibrated frequentist properties and outperforms many widely used methods, including Bayesian additive regression trees. The case study compares the effectiveness of early aggressive use of biological medication in treating children with newly diagnosed juvenile idiopathic arthritis, using data extracted from electronic medical records. Discussions and future directions are presented.</p></abstract>
<kwd-group>
<kwd>causal inference</kwd>
<kwd>matching</kwd>
<kwd>doubly robust (DR) estimator</kwd>
<kwd>marginal structural model</kwd>
<kwd>G-estimation</kwd>
<kwd>real world data (RWD)</kwd>
</kwd-group>
<contract-num rid="cn001">ME-1408-19894</contract-num>
<contract-num rid="cn002">5UL1TR001425-03</contract-num>
<contract-sponsor id="cn001">Patient-Centered Outcomes Research Institute<named-content content-type="fundref-id">10.13039/100006093</named-content></contract-sponsor>
<contract-sponsor id="cn002">National Center for Advancing Translational Sciences<named-content content-type="fundref-id">10.13039/100006108</named-content></contract-sponsor>
<counts>
<fig-count count="7"/>
<table-count count="5"/>
<equation-count count="29"/>
<ref-count count="58"/>
<page-count count="19"/>
<word-count count="12186"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>Data from nonrandomized experiments, such as registries and electronic records, are becoming indispensable sources for answering causal questions in health, social, political, economic, and many other disciplines. Under the assumptions of no unmeasured confounders, ignorable treatment assignment, and distinct model parameters governing the science and treatment assignment mechanisms, Rubin [<xref ref-type="bibr" rid="B1">1</xref>] suggested that Bayesian estimation of the causal treatment effect can be accomplished by directly modeling the outcomes, treating the task as a missing potential outcome problem. Direct modeling can draw on the many Bayesian regression techniques available to address complex data types and data structures; see, for example, Hirano et al. [<xref ref-type="bibr" rid="B2">2</xref>], Zajonc [<xref ref-type="bibr" rid="B3">3</xref>], Imbens and Rubin [<xref ref-type="bibr" rid="B4">4</xref>], and Baccini et al. [<xref ref-type="bibr" rid="B5">5</xref>]. Recent work further suggested that outcome regression-based estimation should be asymptotically more efficient than any inverse probability weighting-based estimation [<xref ref-type="bibr" rid="B6">6</xref>].</p>
<p>Parameter-rich Bayesian modeling techniques are particularly appealing because they do not presume a known functional form, and thus may help mitigate potential model misspecification. Hill [<xref ref-type="bibr" rid="B7">7</xref>] suggested that the Bayesian additive regression tree (BART) can be used for causal inference, and showed that it produced more accurate estimates of average treatment effects than propensity score matching, inverse propensity weighted estimators, and regression adjustment in the nonlinear setting, while performing equally well in the linear setting. Others have used the Gaussian process in conjunction with Dirichlet process priors, e.g., Roy et al. [<xref ref-type="bibr" rid="B8">8</xref>] and Xu et al. [<xref ref-type="bibr" rid="B9">9</xref>]. Roy et al. [<xref ref-type="bibr" rid="B10">10</xref>] devised enriched Dirichlet process priors to tackle missing covariate issues. However, naive use of regression techniques could lead to substantial bias in estimating the causal effect, as demonstrated in Hahn et al. [<xref ref-type="bibr" rid="B11">11</xref>].</p>
<p>The search for ways of incorporating the propensity of treatment selection into Bayesian causal inference is long-standing. Including the propensity score (PS) as a covariate in the outcome model may seem a natural way; however, joint modeling of the outcome and treatment selection models leads to a &#x0201C;feedback&#x0201D; issue. A two-staged approach was suggested by McCandless et al. [<xref ref-type="bibr" rid="B12">12</xref>], Zigler et al. [<xref ref-type="bibr" rid="B13">13</xref>], and others. Whether the uncertainty of the first-step propensity score modeling should be taken into account when obtaining the final result in the second step remains a point of discussion [<xref ref-type="bibr" rid="B14">14</xref>&#x02013;<xref ref-type="bibr" rid="B17">17</xref>]. Saarela et al. [<xref ref-type="bibr" rid="B18">18</xref>] proposed an approximate Bayesian approach incorporating inverse treatment assignment probabilities as importance-sampling weights in Monte Carlo integration; it offers a Bayesian version of augmented inverse probability of treatment weighting (AIPTW). Hahn et al. [<xref ref-type="bibr" rid="B19">19</xref>] suggested incorporating the estimated treatment propensity into the regression to explicitly induce a covariate-dependent prior in the regression model. These methods all require a separate step of treatment propensity modeling, which may suffer if the propensity model is misspecified.</p>
<p>Matching is one of the most sought-after methods for the design and analysis of observational studies answering causal questions. Matching experimental units on their pre-treatment characteristics helps to remove bias by ensuring similarity, or balance, between the experimental units of the two treatment groups. Matching methods impute the missing potential outcome with the value from the nearest match, or with a weighted average of the values within the neighborhood defined by a chosen caliper. Matching on multiple covariates can be challenging when the dimension of the covariates is large. For this reason, matching is often performed on the estimated propensity score (PS) or by the Mahalanobis distance (MD). The idea is that, under the no unmeasured confounder setting, matching induces balance between the treated and untreated groups; it therefore serves to transform a nonrandomized study into a pseudo-randomized study. There are many different matching techniques; a comprehensive review is provided in Stuart [<xref ref-type="bibr" rid="B20">20</xref>]. A recent study by King and Nielsen [<xref ref-type="bibr" rid="B21">21</xref>] compared PS matching with MD matching and suggested that PS matching can result in a more biased and less accurate estimate of the average causal treatment effect as the precision of matching improves, whereas MD matching shows improved accuracy. Common to matching methods, data points without a match are discarded; such a practice may leave a sample that is no longer representative of the target population. A user-specified caliper is often required, but different calipers can lead to very different results. Furthermore, matching on a misspecified PS can lead to invalid causal inference.</p>
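<p>To make the mechanics concrete, the following sketch (illustrative only, and not part of GPMatch; the function name is ours) performs 1:1 nearest-neighbor Mahalanobis matching of treated to control units. Treated units whose best match falls outside a user-chosen caliper are discarded, which is exactly the practice noted above that can leave an unrepresentative sample.</p>

```python
import numpy as np

def mahalanobis_match(X_t, X_c, caliper=np.inf):
    """Pair each treated unit with its nearest control in Mahalanobis distance.

    Returns a list of (treated_index, control_index, distance) triples;
    treated units whose nearest control exceeds the caliper are dropped.
    """
    # The pooled sample covariance defines the Mahalanobis metric.
    VI = np.linalg.inv(np.cov(np.vstack([X_t, X_c]).T))
    pairs = []
    for i, xt in enumerate(X_t):
        diff = X_c - xt
        d = np.sqrt(np.einsum("nj,jk,nk->n", diff, VI, diff))
        j = int(np.argmin(d))
        if d[j] > caliper:
            continue  # no acceptable match: the treated unit is discarded
        pairs.append((i, j, float(d[j])))
    return pairs

rng = np.random.default_rng(0)
X_t = rng.normal(0.3, 1.0, size=(20, 2))   # treated covariates
X_c = rng.normal(0.0, 1.0, size=(60, 2))   # control covariates
pairs = mahalanobis_match(X_t, X_c, caliper=1.0)
```

With an infinite caliper every treated unit is retained; tightening the caliper trades bias in the matches for a smaller, potentially less representative, analysis sample.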
<p>A combination of matching and regression is a better approach than using either alone [<xref ref-type="bibr" rid="B22">22</xref>]. Ho et al. [<xref ref-type="bibr" rid="B15">15</xref>] advocated matching as nonparametric preprocessing for reducing dependence on parametric modeling assumptions. Gutman and Rubin [<xref ref-type="bibr" rid="B23">23</xref>] examined different strategies for combining preprocessed matching with regression modeling of the outcome through extensive simulation studies. They demonstrated that some commonly used causal inference methods have poor operating characteristics, and considered ways to correct the variance estimate of the causal treatment effect obtained from regression modeling after preprocessed matching. To our knowledge, no existing method accomplishes matching and regression modeling in a single step.</p>
<p>The Gaussian process (GP) prior has been widely used to describe biological, social, financial, and physical phenomena, owing to its ability to model highly complex dynamic systems and its many desirable mathematical properties. Recent literature, e.g., Choi and Woo [<xref ref-type="bibr" rid="B24">24</xref>] and Choi and Schervish [<xref ref-type="bibr" rid="B25">25</xref>], has established posterior consistency for Bayesian partially linear GP regression models. Bayesian modeling with a GP prior can be viewed as a marginal structural model in which the treatment effect is modeled as a linear function of background variables. It predicts the missing response by a weighted sum of the observed data, assigning larger weights to observations in closer proximity and smaller weights to those further away, much like a matching procedure. This motivated us to consider the GP prior covariance function as a matching tool for Bayesian causal inference.</p>
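<p>The weighted-sum view can be sketched as follows (a minimal illustration; the squared-exponential kernel and its hyperparameter values are our choices, not a specification from this article). The GP posterior mean at a point is a linear combination of the observed outcomes, with weights derived from the prior covariance function that concentrate on units whose covariates lie nearby.</p>

```python
import numpy as np

def sq_exp_kernel(X1, X2, ls=1.0, var=1.0):
    """Squared-exponential GP covariance between rows of X1 and X2."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return var * np.exp(-0.5 * d2 / ls**2)

def gp_weights(X, x_star, noise=0.1):
    """Weights w such that the GP posterior mean at x_star equals w @ y."""
    K = sq_exp_kernel(X, X) + noise * np.eye(len(X))
    k_star = sq_exp_kernel(x_star[None, :], X).ravel()
    # Posterior mean is k_star' K^{-1} y = (K^{-1} k_star)' y, so the
    # weights depend only on the covariance function, not on y itself.
    return np.linalg.solve(K, k_star)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
w = gp_weights(X, X[0])
# Units close to X[0] in covariate space typically carry the largest weights,
# echoing a matching procedure with a smooth, data-driven caliper.
```

Because the weights are determined entirely by the covariance function, choosing that function amounts to choosing a matching distance, which is the intuition developed in the following sections.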
<p>The idea of utilizing a GP prior in a Bayesian approach to causal inference is not new. Examples can be found in Roy et al. [<xref ref-type="bibr" rid="B8">8</xref>] for addressing heterogeneous treatment effects, in Xu et al. [<xref ref-type="bibr" rid="B9">9</xref>] for handling dynamic treatment assignment, and in Roy et al. [<xref ref-type="bibr" rid="B10">10</xref>] for tackling missing data. While these studies demonstrated that the GP prior can achieve flexible modeling and tackle complex settings, none has considered the GP prior as a matching tool. This study adds to the literature in several ways. First, we offer a principled approach to Bayesian causal inference utilizing the GP prior covariance function as a matching tool, which accomplishes matching and flexible outcome modeling in a single step. Second, we provide causal assumptions that are more relaxed than the widely adopted assumptions of the landmark paper by Rosenbaum and Rubin [<xref ref-type="bibr" rid="B26">26</xref>]. By admitting additional random noise in the outcome measures, these new assumptions fit more naturally within the Bayesian framework. Under these weaker causal assumptions, the GPMatch method offers a doubly robust approach in the sense that the average causal treatment effect is correctly estimated when either of two conditions is met: (1) the mean function correctly specifies the prognostic function of the outcome; or (2) the covariance function correctly specifies the treatment propensity.</p>
<p>The rest of the article is organized as follows. Section 2 describes the methods, presenting the problem setup, the causal assumptions, and the model specifications. The utility of the GP covariance function as a matching tool is presented in Section 3, followed by a discussion of its double robustness property. Simulation studies are presented in Section 4. The simulations are designed to represent the real-world setting in which the true functional form is unknown, including the well-known simulation design suggested by Kang and Schafer [<xref ref-type="bibr" rid="B27">27</xref>]. We compared the GPMatch approach with several commonly used causal inference methods, namely linear regression with PS adjustment, inverse probability of treatment weighting, and BART, without assuming any knowledge of the true data-generating models. The results demonstrate that GPMatch enjoys well-calibrated frequentist properties and outperforms many widely used methods under the dual misspecification setting. Section 5 presents a case study examining the comparative effectiveness of an early introduction of biological medication in treating children with recently diagnosed juvenile idiopathic arthritis (JIA). Section 6 presents the summary, discussions, and future directions.</p>
</sec>
<sec id="s2">
<title>2. Method</title>
<sec>
<title>2.1. Notations, problem setup, and parameters of interests</title>
<p>For the <italic>i</italic><sup><italic>th</italic></sup> sample unit, we observe <italic>D</italic><sub><italic>i</italic></sub> &#x0003D; (<italic>X</italic><sub><italic>i</italic></sub>, <italic>A</italic><sub><italic>i</italic></sub>, <italic>Y</italic><sub><italic>i</italic></sub>), <italic>i</italic> &#x0003D; 1, ..., <italic>n</italic>, a random sample from a given study population. Denote the causal factor or &#x0201C;treatment&#x0201D; by <italic>A</italic><sub><italic>i</italic></sub>. For simplicity of exposition, here we consider <italic>A</italic><sub><italic>i</italic></sub> &#x0003D; 1/0. Let <italic>Y</italic><sub><italic>i</italic></sub> denote the observed outcome and <italic><bold>X</bold></italic><sub><italic><bold>i</bold></italic></sub> the <italic>p</italic>-dimensional observed vector of background variables, which contains the determinants of treatment assignment <italic>Pr</italic>(<italic>A</italic><sub><italic>i</italic></sub> &#x0003D; 1) &#x0003D; &#x003C0;(<italic><bold>x</bold></italic><sub><italic>i</italic></sub>) and the determinants of the potential outcomes <inline-formula><mml:math id="M1"><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle mathvariant="bold"><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mi>a</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">A</mml:mi></mml:mrow></mml:mrow></mml:math></inline-formula>. Given the background variables <italic><bold>X</bold></italic><sub><italic>i</italic></sub>, such as patient age, gender, genetic makeup, disease status, environmental exposures, and past treatment histories, the potential outcomes for a given patient are determined by the underlying science mechanisms <inline-formula><mml:math id="M2"><mml:mrow><mml:msup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle mathvariant="bold"><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, and the treatment is assigned following <italic>A</italic><sub><italic>i</italic></sub>&#x0007E;<italic>Ber</italic>(&#x003C0;(<italic><bold>x</bold></italic><sub><italic><bold>i</bold></italic></sub>)).</p>
<p>Under the given treatment assignment, the observed outcome may be measured with error, i.e., a noisy version of the corresponding potential outcomes,</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M3"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:msubsup><mml:mi>Y</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msubsup><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:msubsup><mml:mi>Y</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msubsup><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x003F5;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>E</italic>(&#x003F5;<sub><italic>i</italic></sub>) &#x0003D; 0. In other words, the observed outcome for the <italic>i</italic><sup><italic>th</italic></sup> individual is a realization of the joint actions between the science mechanisms and the treatment assignment. Any two sample units that share the same background features <italic><bold>X</bold></italic><sub><italic><bold>i</bold></italic></sub> <bold>&#x0003D;</bold> <italic><bold>X</bold></italic><sub><italic><bold>j</bold></italic></sub> &#x0003D; <italic><bold>x</bold></italic>, regardless of their treatment assignment, are expected to experience the same potential outcomes <inline-formula><mml:math id="M4"><mml:mrow><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:mstyle mathvariant="bold-italic"><mml:mi>x</mml:mi></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:mstyle mathvariant="bold-italic"><mml:mi>x</mml:mi></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>x</mml:mi></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>.</p>
<p>Our goal is to estimate the averaged treatment effect for a given study population</p>
<disp-formula id="E2"><label>(2)</label><mml:math id="M5"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>A</mml:mi><mml:mi>T</mml:mi><mml:mi>E</mml:mi><mml:mo>=</mml:mo><mml:mi>E</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mi>&#x003C4;</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>x</mml:mi></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='false'>)</mml:mo><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where &#x003C4;(<italic><bold>x</bold></italic>) &#x0003D; <italic>f</italic><sup>(1)</sup>(<italic><bold>x</bold></italic>)&#x02212;<italic>f</italic><sup>(0)</sup>(<italic><bold>x</bold></italic>).</p>
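<p>As a toy numerical illustration of Equation (2), possible only in simulation where both potential-outcome surfaces are known (the functional forms below are ours, chosen for illustration), the ATE is simply the population average of the unit-level effect &#x003C4;(<italic><bold>x</bold></italic>):</p>

```python
import numpy as np

# Hypothetical science mechanisms f0, f1 -- illustrative only.
rng = np.random.default_rng(1)
x = rng.uniform(0, 2, size=100_000)   # background variable X ~ Uniform(0, 2)

f0 = np.sin(x)                        # potential outcome surface under control
f1 = np.sin(x) + 0.5 + 0.2 * x        # potential outcome surface under treatment
tau = f1 - f0                         # unit-level effect tau(x) = 0.5 + 0.2 x
ate = tau.mean()                      # Monte Carlo estimate of ATE = E[tau(X)]
# Analytically, E[0.5 + 0.2 X] = 0.5 + 0.2 * 1 = 0.7 for X ~ Uniform(0, 2).
```

With observational data only one potential outcome per unit is seen, so the remainder of the article concerns estimating this same average without access to <italic>f</italic><sup>(0)</sup> and <italic>f</italic><sup>(1)</sup> directly.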
</sec>
<sec>
<title>2.2. The causal assumptions</title>
<p>To ensure identifiability of the causal treatment effect, we impose the following causal assumptions, which may be considered as a somewhat relaxed version of commonly adopted causal assumptions as suggested in Rosenbaum and Rubin [<xref ref-type="bibr" rid="B26">26</xref>]:</p>
<list list-type="simple">
<list-item><p><bold>CA1</bold>. Stable Unit Treatment Value Expectation Assumption (SUTVEA).</p></list-item>
</list> 
<list list-type="simple">
<list-item><p>(i) We consider the observed outcome may be a noisy version of the potential outcome where the expectation of the observed outcome is jointly determined by the underlying science mechanisms and the treatment assignment <inline-formula><mml:math id="M6"><mml:mrow><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>, for <italic>A</italic><sub><italic>i</italic></sub> &#x0003D; 0, 1.</p></list-item> 
<list-item><p>(ii) For the underlying science mechanism that generates potential outcomes, there exists a constant <italic>K</italic> &#x0003E; 0 such that |<italic>f</italic><sup>(<italic>a</italic>)</sup>| &#x02264; <italic>K</italic>, for <italic>a</italic> &#x0003D; 0, 1.</p></list-item>
</list>
<list list-type="simple">
<list-item><p><bold>CA2</bold>. Ignorable Treatment Assignment Assumption, or no unmeasured confounders assumption requires the treatment assignment is independent from the underlying science mechanism given the observed covariates, <inline-formula><mml:math id="M7"><mml:mrow><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x022A5;</mml:mo><mml:msup><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>|</mml:mo><mml:mstyle mathvariant="bold-italic"><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mstyle></mml:mrow></mml:math></inline-formula> for <italic>a</italic> &#x0003D; 0, 1 .</p></list-item> 
<list-item><p><bold>CA3</bold>. Positivity Assumption. For every sample unit, there is a nonzero probability of being assigned to either one of the treatment arms, i.e., 0 &#x0003C; <italic>Pr</italic>(<italic>A</italic><sub><italic>i</italic></sub>|<italic><bold>X</bold></italic><sub><italic><bold>i</bold></italic></sub>) &#x0003C; 1.</p></list-item>
</list>
<p>The SUTVEA assumption represents a somewhat weaker assumption than SUTVA. It acknowledges the existence of residual random error in the outcome measure. The observed outcomes may differ from the corresponding true potential outcomes due to measurement error or random noise related to the treatment received. For example, outcomes could differ by recorder, the timing of the treatment, the pre-surgery preparation procedure, or concomitant medication. In addition, we allow the potential outcomes from different experimental units to be correlated, with the correlations determined by the covariates. Under the no unmeasured confounders assumption, we may model the correlation between two potential outcomes. Since only one of the potential outcomes can be observed, causal inference presents a highly structured missing data setup in which the correlations between <inline-formula><mml:math id="M8"><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> are not directly identifiable. By admitting residual random errors and allowing for explicit modeling of the covariance structure, the new assumptions may facilitate better statistical inference.</p>
</sec>
<sec>
<title>2.3. Model specifications</title>
<p>The marginal structural model (MSM) is a widely adopted modeling approach to causal inference, which serves as a natural framework for Bayesian causal inference. The MSM specifies</p>
<disp-formula id="E121"><mml:math id="M9"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mi>Y</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:msubsup><mml:mi>Y</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msubsup><mml:mo>+</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:msup><mml:mi>&#x003C4;</mml:mi><mml:mo>*</mml:mo></mml:msup><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>

<p>Without prior knowledge of the true functional form, we propose GPMatch as a partially linear Gaussian process regression fitted to the observed outcomes,</p>
<disp-formula id="E4"><label>(3)</label><mml:math id="M10"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>&#x003B7;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>x</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>i</mml:mi></mml:mstyle></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mi>&#x003C4;</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>x</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>i</mml:mi></mml:mstyle></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x003F5;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where</p>
<disp-formula id="E5"><mml:math id="M111"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:msub><mml:mi>&#x003B7;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>x</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>i</mml:mi></mml:mstyle></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>~</mml:mo><mml:mi>G</mml:mi><mml:mi>P</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003BC;</mml:mi><mml:mi>f</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>x</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>i</mml:mi></mml:mstyle></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>K</mml:mi></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mi>&#x003F5;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>~</mml:mo><mml:mi>N</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msubsup><mml:mi>&#x003C3;</mml:mi><mml:mn>0</mml:mn><mml:mn>2</mml:mn></mml:msubsup><mml:mo stretchy='false'>)</mml:mo><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mi>&#x003F5;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x022A5;</mml:mo><mml:msub><mml:mi>&#x003B7;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Here, we may let <bold>&#x003BC;</bold><sub><italic><bold>f</bold></italic></sub> &#x0003D; ((1, <italic><bold>X</bold></italic><sub><italic><bold>i</bold></italic></sub>)<bold>&#x003B2;</bold>)<sub><italic>n</italic>&#x000D7;1</sub>, where <italic><bold>&#x003B2;</bold></italic> is a (1&#x0002B;<italic>p</italic>)-dimensional vector of regression coefficients for the mean function; this allows any existing knowledge about the prognostic determinants of the outcome to be incorporated. Similarly, let <bold>&#x003C4;(<italic>x</italic>)</bold> &#x0003D; ((1, <italic><bold>X</bold></italic><sub><italic><bold>i</bold></italic></sub>)<bold>&#x003B1;</bold>)<sub><italic>n</italic>&#x000D7;1</sub> to allow for a potentially heterogeneous treatment effect, where <bold>&#x003B1;</bold> is a (1&#x0002B;<italic>p</italic>)-dimensional vector of regression coefficients for the treatment effect. For both <bold>&#x003BC;</bold><sub><italic><bold>f</bold></italic></sub> and <bold>&#x003C4;</bold>, <italic><bold>x</bold></italic><sub><italic><bold>i</bold></italic></sub> may include higher-order terms, interactions, and dummy or coarsened versions of the background variables.</p>
<p>Letting <inline-formula><mml:math id="M12"><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mstyle><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>.</mml:mo><mml:mo>.</mml:mo><mml:mo>.</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, the model (Equation 3) can be re-expressed in the multivariate representation</p>
<disp-formula id="E6"><label>(4)</label><mml:math id="M13"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>Y</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle></mml:msub><mml:mo stretchy='false'>&#x0007C;</mml:mo><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>V</mml:mi></mml:mstyle><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>&#x003B3;</mml:mi></mml:mstyle><mml:mo>~</mml:mo><mml:mi>M</mml:mi><mml:mi>V</mml:mi><mml:mi>N</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>Z</mml:mi><mml:mi>&#x003B3;</mml:mi></mml:mstyle><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>&#x003A3;</mml:mi></mml:mstyle><mml:mo stretchy='false'>)</mml:mo><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic><bold>Z</bold></italic> &#x0003D; (1, <italic><bold>X</bold></italic><sub><italic><bold>i</bold></italic></sub>, <italic>A</italic><sub><italic>i</italic></sub>, <italic>A</italic><sub><italic>i</italic></sub>&#x000D7;<italic><bold>X</bold></italic><sub><italic><bold>i</bold></italic></sub>)<sub><italic>n</italic>&#x000D7;(2&#x0002B;2<italic>p</italic>)</sub>, <italic><bold>&#x003B3;</bold></italic> &#x0003D; (<italic><bold>&#x003B2;, &#x003B1;</bold></italic>)&#x02032;, <bold>&#x003A3;</bold> &#x0003D; (&#x003C3;<sub><italic>ij</italic></sub>)<sub><italic>n</italic>&#x000D7;<italic>n</italic></sub>, with <inline-formula><mml:math id="M14"><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>K</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mstyle><mml:mo>,</mml:mo><mml:mstyle mathvariant="bold-italic"><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:msub><mml:mrow><mml:mi>&#x003B4;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>. Here, &#x003B4;<sub><italic>ij</italic></sub> is the Kronecker delta: &#x003B4;<sub><italic>ij</italic></sub> &#x0003D; 1 if <italic>i</italic> &#x0003D; <italic>j</italic>, and 0 otherwise.</p>
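<p>A minimal numerical sketch of the multivariate representation (Equation 4) is given below. It is illustrative only: the sample size, covariates, kernel settings, and variable names are arbitrary choices for exposition, and the kernel anticipates the SE form introduced later in Equation (5); this is not the full GPMatch implementation.</p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n subjects, p covariates, binary treatment A
n, p = 50, 2
X = rng.normal(size=(n, p))
A = rng.integers(0, 2, size=n)

# Design matrix Z = (1, X_i, A_i, A_i x X_i), of dimension n x (2 + 2p)
Z = np.column_stack([np.ones(n), X, A, A[:, None] * X])

# Illustrative SE-type kernel on v_i = x_i with unit length-scales
sigma_f2, sigma_02 = 1.0, 0.5
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = sigma_f2 * np.exp(-d2)

# sigma_ij = K(v_i, v_j) + sigma_0^2 * delta_ij
Sigma = K + sigma_02 * np.eye(n)

# One draw from Y_n | A, X, gamma ~ MVN(Z gamma, Sigma)
gamma = rng.normal(size=Z.shape[1])
Y = rng.multivariate_normal(Z @ gamma, Sigma)
```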
<p>The Gaussian process can be considered a distribution over functions. The covariance function <italic><bold>K</bold></italic>, where <italic>k</italic><sub><italic>ij</italic></sub> &#x0003D; <italic>Cov</italic>(<bold>&#x003B7;<sub><italic>i</italic></sub></bold>, <bold>&#x003B7;</bold><bold><sub><italic>j</italic></sub></bold>), plays a critical role in GP regression: it reflects the prior belief about the functional form, determining its shape and degree of smoothness. Since the exact matching structure is often not available, a natural choice for the GP prior covariance function <italic><bold>K</bold></italic> is the squared-exponential (SE) function</p>
<disp-formula id="E7"><label>(5)</label><mml:math id="M15"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>K</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:msubsup><mml:mi>&#x003C3;</mml:mi><mml:mi>f</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:mi>e</mml:mi><mml:mi>x</mml:mi><mml:mi>p</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mstyle displaystyle='true'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>p</mml:mi></mml:munderover><mml:mrow><mml:mfrac><mml:mrow><mml:mo stretchy='false'>&#x0007C;</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>v</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mo stretchy='false'>&#x0007C;</mml:mo><mml:mn>2</mml:mn></mml:msup></mml:mrow><mml:mrow><mml:msub><mml:mi>&#x003D5;</mml:mi><mml:mi>k</mml:mi></mml:msub></mml:mrow></mml:mfrac></mml:mrow></mml:mstyle></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>for <italic>i, j</italic> &#x0003D; 1, ..., <italic>n</italic>. The (&#x003D5;<sub>1</sub>, &#x003D5;<sub>2</sub>, ..., &#x003D5;<sub><italic>p</italic></sub>) are length-scale parameters, one for each covariate.</p>
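<p>Equation (5) translates directly into code. The sketch below (illustrative only; the function and variable names are ours) computes the SE covariance matrix with one length-scale per covariate; identical rows of the matching variables yield <italic>k</italic><sub><italic>ij</italic></sub> &#x0003D; &#x003C3;<sub><italic>f</italic></sub><sup>2</sup>, i.e., a completely matched pair.</p>

```python
import numpy as np

def se_cov(V, sigma_f2, phi):
    """Squared-exponential covariance of Equation (5).

    V   : (n, p) array of matching variables v_i
    phi : (p,) length-scale parameters (phi_1, ..., phi_p)
    """
    V = np.asarray(V, dtype=float)
    phi = np.asarray(phi, dtype=float)
    # |v_ki - v_kj|^2 / phi_k, summed over k = 1, ..., p
    d2 = ((V[:, None, :] - V[None, :, :]) ** 2 / phi).sum(axis=-1)
    return sigma_f2 * np.exp(-d2)

# Rows 0 and 1 are identical (matched); row 2 is distant (unmatched)
V = np.array([[0.0, 1.0], [0.0, 1.0], [2.0, -1.0]])
K = se_cov(V, sigma_f2=1.5, phi=[1.0, 2.0])
```

The matched pair attains the full signal variance, while distant pairs shrink smoothly toward zero rather than dropping abruptly as in exact matching.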
<p>There are several considerations in choosing the SE covariance function. GP regression with the SE covariance can be viewed as a Bayesian linear regression model with infinitely many basis functions, and is thus able to fit a smooth response surface. Because the GP learns the length-scale and covariance parameters from the training data, GP regression does not require cross-validation, unlike other flexible models such as splines or the support vector machine (SVM) [<xref ref-type="bibr" rid="B28">28</xref>]. Moreover, the SE covariance function provides a distance metric similar to the Mahalanobis distance, which has frequently been used as a matching tool.</p>
<p>The model specification is completed by specifying the remaining priors:</p>
<disp-formula id="E8"><mml:math id="M16"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>&#x003B3;</mml:mi></mml:mstyle><mml:mo>~</mml:mo><mml:mi>M</mml:mi><mml:mi>V</mml:mi><mml:mi>N</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mn>0</mml:mn></mml:mstyle><mml:mo>,</mml:mo><mml:msubsup><mml:mi>&#x003C3;</mml:mi><mml:mi>f</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:mi>&#x003C9;</mml:mi><mml:msubsup><mml:mi>&#x003C3;</mml:mi><mml:mrow><mml:mi>l</mml:mi><mml:mi>m</mml:mi></mml:mrow><mml:mn>2</mml:mn></mml:msubsup><mml:msup><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:msup><mml:mi>Z</mml:mi><mml:mo>&#x02032;</mml:mo></mml:msup><mml:mi>Z</mml:mi></mml:mstyle><mml:mo stretchy='false'>)</mml:mo></mml:mrow><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msubsup><mml:mi>&#x003C3;</mml:mi><mml:mn>0</mml:mn><mml:mn>2</mml:mn></mml:msubsup><mml:mo>~</mml:mo><mml:mi>I</mml:mi><mml:mi>G</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msubsup><mml:mi>&#x003C3;</mml:mi><mml:mi>f</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:mo>~</mml:mo><mml:mi>I</mml:mi><mml:mi>G</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mi>f</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mi>f</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mi>&#x003D5;</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo>~</mml:mo><mml:mi>I</mml:mi><mml:mi>G</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mi>&#x003D5;</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>b</mml:mi><mml:mi>&#x003D5;</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>We set <inline-formula><mml:math id="M17"><mml:mrow><mml:mi>&#x003C9;</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:msup><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>6</mml:mn></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003D5;</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003D5;</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>2</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>/</mml:mo><mml:mn>2</mml:mn></mml:mrow></mml:math></inline-formula>, where &#x003C3;<sub><italic>lm</italic></sub><sup>2</sup> is the variance estimated from a simple linear regression of <italic>Y</italic> on <italic>A</italic> and <italic>X</italic>; this choice is made for computational efficiency.</p>
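<p>These settings can be sampled directly; the sketch below is an illustration of ours, with a made-up stand-in value for &#x003C3;<sub><italic>lm</italic></sub><sup>2</sup> and hypothetical names. With shape 2 and scale &#x003C3;<sub><italic>lm</italic></sub><sup>2</sup>/2, each variance prior has mean &#x003C3;<sub><italic>lm</italic></sub><sup>2</sup>/2, so the two variance components together are centered at the regression-based estimate.</p>

```python
import numpy as np

rng = np.random.default_rng(2)

# Hyperparameter settings from the text
omega = 1e6
a_phi = b_phi = 1.0
a_0 = a_f = 2.0
sigma2_lm = 1.2                    # stand-in for the variance from the lm of Y on A, X
b_0 = b_f = sigma2_lm / 2.0

def draw_inv_gamma(a, b, size=None):
    # If G ~ Gamma(shape=a, scale=1/b), then 1/G ~ IG(a, b)
    return 1.0 / rng.gamma(shape=a, scale=1.0 / b, size=size)

sigma2_0 = draw_inv_gamma(a_0, b_0)          # noise variance sigma_0^2
sigma2_f = draw_inv_gamma(a_f, b_f)          # GP signal variance sigma_f^2
phi = draw_inv_gamma(a_phi, b_phi, size=3)   # length-scales for p = 3 covariates
```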
<p>The posterior of the parameters can be obtained by implementing a Gibbs sampling algorithm: first, sample the covariance function parameters from their conditional posterior distribution [<bold>&#x003A3;</bold>|<italic>Data</italic>, <italic><bold>&#x003B1;, &#x003B2;</bold></italic>]; then, sample the regression coefficient parameters from their conditional posterior distribution [<italic><bold>&#x003B1;, &#x003B2;</bold></italic>|<italic>Data</italic>, <italic><bold>&#x003A3;</bold></italic>], which is a multivariate normal distribution. The individual-level treatment effect can be estimated by <inline-formula><mml:math id="M18"><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mover accent="true"><mml:mrow><mml:mstyle mathvariant="bold"><mml:mi>&#x003B1;</mml:mi></mml:mstyle></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula>, and the averaged treatment effect is estimated by <inline-formula><mml:math id="M19"><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>A</mml:mi><mml:mi>T</mml:mi><mml:mi>E</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:munderover><mml:mfrac><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:mfrac></mml:mrow></mml:math></inline-formula>. Further details are provided in the <xref ref-type="supplementary-material" rid="SM1">Supplementary material</xref>.</p>
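<p>The second Gibbs step and the resulting plug-in treatment effect estimates can be sketched as follows. This is a simplified illustration of ours: <bold>&#x003A3;</bold> is held fixed at an arbitrary value rather than sampled, the conditional posterior mean is shown in place of a posterior draw, and all data and names are made up.</p>

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 80, 2
X = rng.normal(size=(n, p))
A = rng.integers(0, 2, size=n)
Z = np.column_stack([np.ones(n), X, A, A[:, None] * X])

# Pretend Sigma was drawn in the current Gibbs iteration
Sigma = np.eye(n)
Y = Z @ np.r_[1.0, 0.5, -0.5, 2.0, 0.3, 0.0] + rng.normal(size=n)

c = 1e6                                          # prior scale, playing the role of omega * sigma_f^2 * sigma_lm^2
Si = np.linalg.inv(Sigma)
prec = Z.T @ Si @ Z + (Z.T @ Z) / c              # conditional posterior precision of gamma = (beta, alpha)'
gamma_hat = np.linalg.solve(prec, Z.T @ Si @ Y)  # conditional posterior mean [gamma | Data, Sigma]

beta_hat, alpha_hat = gamma_hat[:1 + p], gamma_hat[1 + p:]
tau_i = np.column_stack([np.ones(n), X]) @ alpha_hat  # tau_hat(x_i) = (1, X_i) alpha_hat
ATE = tau_i.mean()                                    # sum_i tau_hat(x_i) / n
```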
</sec>
</sec>
<sec id="s3">
<title>3. Estimating averaged treatment effect</title>
<sec>
<title>3.1. GP covariance as a matching tool (GPMatch)</title>
<p>To demonstrate the utility of the GP covariance function as a matching tool, let us first consider a simple setting with a categorical <italic>X</italic> variable that has <italic>l</italic> &#x0003D; 1, ..., <italic>L</italic> levels. We fit the data with a simple nonparametric GP model,</p>
<disp-formula id="E9"><label>(6)</label><mml:math id="M20"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>Y</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle></mml:msub><mml:mo>~</mml:mo><mml:mi>M</mml:mi><mml:mi>V</mml:mi><mml:mi>N</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mi>&#x003BC;</mml:mi><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mn>1</mml:mn></mml:mstyle><mml:mi>n</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:mi>&#x003C4;</mml:mi><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mi>n</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>&#x003A3;</mml:mi></mml:mstyle><mml:mo>=</mml:mo><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>K</mml:mi></mml:mstyle><mml:mo>+</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>&#x003C3;</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mn>0</mml:mn></mml:mstyle></mml:msub><mml:msup><mml:mrow><mml:mtext>&#x000A0;</mml:mtext></mml:mrow><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mn>2</mml:mn></mml:mstyle></mml:msup><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>I</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic><bold>K</bold></italic> &#x0003D; (<italic>k</italic><sub><italic>ij</italic></sub>)<sub><italic>n</italic>&#x000D7;<italic>n</italic></sub>, with <italic>k</italic><sub><italic>ij</italic></sub> &#x0003D; 1 for <italic>X</italic><sub><italic>i</italic></sub> &#x0003D; <italic>X</italic><sub><italic>j</italic></sub> &#x0003D; <italic>l</italic>, indicating that the pair is completely matched, and <italic>k</italic><sub><italic>ij</italic></sub> &#x0003D; 0 if <italic>X</italic><sub><italic>i</italic></sub>&#x02260;<italic>X</italic><sub><italic>j</italic></sub>, i.e., the pair is unmatched. Thus, the covariance matrix of the GPMatch model (Equation 6) is block diagonal, where the <italic>l</italic><sup><italic>th</italic></sup> block takes the form</p>
<disp-formula id="E10"><mml:math id="M21"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mstyle mathvariant="bold"><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>&#x003A3;</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>l</mml:mi></mml:mstyle></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mi>&#x003C3;</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:mi>&#x003C1;</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>I</mml:mi></mml:mstyle><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mi>&#x003C1;</mml:mi><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>J</mml:mi></mml:mstyle><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mstyle></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>with <inline-formula><mml:math id="M22"><mml:mrow><mml:msup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:math></inline-formula>, &#x003C1; &#x0003D; 1/&#x003C3;<sup>2</sup>, and <italic><bold>J</bold></italic><sub><italic>n</italic><sub><italic>l</italic></sub></sub> denoting the <italic>n</italic><sub><italic>l</italic></sub>&#x000D7;<italic>n</italic><sub><italic>l</italic></sub> matrix of ones. The estimates of the regression parameters can be derived by</p>
<disp-formula id="E11"><mml:math id="M23"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mrow><mml:mo>(</mml:mo><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:mover accent='true'><mml:mi>&#x003BC;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mover accent='true'><mml:mi>&#x003C4;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover></mml:mtd></mml:mtr></mml:mtable><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:msubsup><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mn>1</mml:mn></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle><mml:mo>&#x02032;</mml:mo></mml:msubsup></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msubsup><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle><mml:mo>&#x02032;</mml:mo></mml:msubsup></mml:mtd></mml:mtr></mml:mtable><mml:mo>)</mml:mo></mml:mrow><mml:msup><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>&#x003A3;</mml:mi></mml:mstyle><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mn>1</mml:mn></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle></mml:msub><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mstyle mathvariant='bold-italic' 
mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle></mml:mstyle></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:msubsup><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mn>1</mml:mn></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle><mml:mo>&#x02032;</mml:mo></mml:msubsup></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msubsup><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle><mml:mo>&#x02032;</mml:mo></mml:msubsup></mml:mtd></mml:mtr></mml:mtable><mml:mo>)</mml:mo></mml:mrow><mml:msup><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>&#x003A3;</mml:mi></mml:mstyle><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>Y</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle></mml:mstyle></mml:msub><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>It follows that the estimated average treatment effect is,</p>
<disp-formula id="E12"><label>(7)</label><mml:math id="M24"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mover accent='true'><mml:mi>&#x003C4;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msubsup><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mn>1</mml:mn></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle><mml:mo>&#x02032;</mml:mo></mml:msubsup><mml:msup><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>&#x003A3;</mml:mi></mml:mstyle><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mn>1</mml:mn></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle></mml:msub><mml:msubsup><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle><mml:mo>&#x02032;</mml:mo></mml:msubsup><mml:msup><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>&#x003A3;</mml:mi></mml:mstyle><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>Y</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle></mml:mstyle></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msubsup><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle><mml:mo>&#x02032;</mml:mo></mml:msubsup><mml:msup><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>&#x003A3;</mml:mi></mml:mstyle><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:msub><mml:mstyle mathvariant='bold-italic' 
mathsize='normal'><mml:mn>1</mml:mn></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle></mml:msub><mml:msubsup><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mn>1</mml:mn></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle><mml:mo>&#x02032;</mml:mo></mml:msubsup><mml:msup><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>&#x003A3;</mml:mi></mml:mstyle><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>Y</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle></mml:mstyle></mml:msub></mml:mrow><mml:mrow><mml:msubsup><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mn>1</mml:mn></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle><mml:mo>&#x02032;</mml:mo></mml:msubsup><mml:msup><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>&#x003A3;</mml:mi></mml:mstyle><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mn>1</mml:mn></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle></mml:msub><mml:msubsup><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle><mml:mo>&#x02032;</mml:mo></mml:msubsup><mml:msup><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>&#x003A3;</mml:mi></mml:mstyle><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mstyle 
mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle></mml:mstyle></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msubsup><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle><mml:mo>&#x02032;</mml:mo></mml:msubsup><mml:msup><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>&#x003A3;</mml:mi></mml:mstyle><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mn>1</mml:mn></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle></mml:msub><mml:msubsup><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mn>1</mml:mn></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle><mml:mo>&#x02032;</mml:mo></mml:msubsup><mml:msup><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>&#x003A3;</mml:mi></mml:mstyle><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>n</mml:mi></mml:mstyle></mml:mstyle></mml:msub></mml:mrow></mml:mfrac><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Applying the Sherman&#x02013;Morrison&#x02013;Woodbury formula, we see that <bold>&#x003A3;</bold><sup>&#x02212;1</sup> is a block diagonal matrix with blocks</p>
<disp-formula id="E13"><mml:math id="M25"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mstyle mathvariant='bold' mathsize='normal'><mml:mi>&#x003A3;</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>l</mml:mi></mml:mstyle></mml:msub><mml:msup><mml:mrow><mml:mtext>&#x000A0;</mml:mtext></mml:mrow><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:msup><mml:mi>&#x003C3;</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:mi>&#x003C1;</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:mi>&#x003C1;</mml:mi><mml:mo>+</mml:mo><mml:msub><mml:mi>n</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mi>&#x003C1;</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mfrac><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>+</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>n</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mi>&#x003C1;</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>I</mml:mi></mml:mstyle><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:msub><mml:mo>&#x02212;</mml:mo><mml:mi>&#x003C1;</mml:mi><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>J</mml:mi></mml:mstyle><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
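<p>The block covariance and its closed-form inverse can be checked numerically; the toy values below are ours for illustration.</p>

```python
import numpy as np

# Toy subclass of n_l = 4 completely matched subjects, with sigma_0^2 = 0.5
sigma_02, nl = 0.5, 4
sigma2 = 1.0 + sigma_02            # sigma^2 = 1 + sigma_0^2
rho = 1.0 / sigma2                 # rho = 1 / sigma^2

I, J = np.eye(nl), np.ones((nl, nl))
Sigma_l = sigma2 * ((1 - rho) * I + rho * J)   # the l-th diagonal block of Sigma

# Closed-form inverse via the Sherman-Morrison-Woodbury formula
coef = 1.0 / (sigma2 * (1 - rho) * (1 - rho + nl * rho))
Sigma_l_inv = coef * ((1 + (nl - 1) * rho) * I - rho * J)
```

Here Sigma_l reproduces the matched-pair structure <italic><bold>J</bold></italic><sub><italic>n</italic><sub><italic>l</italic></sub></sub> &#x0002B; &#x003C3;<sub>0</sub><sup>2</sup><italic><bold>I</bold></italic><sub><italic>n</italic><sub><italic>l</italic></sub></sub>, and the product of the block with its closed-form inverse recovers the identity matrix.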
<p>Let &#x00232;<sub><italic>l</italic>(<italic>a</italic>)</sub> denote the sample mean of the outcome and <italic>n</italic><sub><italic>l</italic>(<italic>a</italic>)</sub> the number of observations for the untreated (<italic>a</italic> &#x0003D; 0) and treated (<italic>a</italic> &#x0003D; 1) groups within the <italic>l</italic><sup><italic>th</italic></sup> subclass. The treatment effect can then be expressed as a weighted sum of two quantities</p>
<disp-formula id="E14"><mml:math id="M26"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mover accent='true'><mml:mi>&#x003C4;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mo>&#x003BB;</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>&#x003C4;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mi>R</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:mo>&#x003BB;</mml:mo><mml:mo stretchy='false'>)</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>&#x003C4;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mi>C</mml:mi></mml:msub><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <inline-formula><mml:math id="M27"><mml:mrow><mml:mi>&#x003BB;</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>&#x003C1;</mml:mi><mml:mi>D</mml:mi><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>&#x003C1;</mml:mi><mml:mi>D</mml:mi><mml:mn>1</mml:mn><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:mi>&#x003C1;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mi>D</mml:mi><mml:mn>2</mml:mn></mml:mrow></mml:mfrac></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M28"><mml:mrow><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>R</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>C</mml:mi><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>D</mml:mi><mml:mn>1</mml:mn></mml:mrow></mml:mfrac></mml:mrow></mml:math></inline-formula> is the averaged treatment effect based on an average of the within-strata contrasts, and <inline-formula><mml:math id="M29"><mml:mrow><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>C</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>C</mml:mi><mml:mn>2</mml:mn></mml:mrow><mml:mrow><mml:mi>D</mml:mi><mml:mn>2</mml:mn></mml:mrow></mml:mfrac></mml:mrow></mml:math></inline-formula> is the effect arising from the contrast between the weighted averages of the treated and untreated samples. The subscripts R and C correspond to the organization of the data table, with strata as rows and treatment as columns.</p>
<disp-formula id="E15"><mml:math id="M30"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mi>C</mml:mi><mml:mn>1</mml:mn><mml:mo>=</mml:mo><mml:mstyle displaystyle='true'><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:msub><mml:mi>n</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mo>&#x000D7;</mml:mo><mml:mstyle displaystyle='true'><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>l</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>l</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mover accent='true'><mml:mi>Y</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover><mml:mrow><mml:mi>l</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>Y</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover><mml:mrow><mml:mi>l</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi>C</mml:mi><mml:mn>2</mml:mn><mml:mo>=</mml:mo><mml:mstyle displaystyle='true'><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>l</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:mo>&#x000D7;</mml:mo><mml:mstyle 
displaystyle='true'><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>l</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:msub><mml:mover accent='true'><mml:mi>Y</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover><mml:mrow><mml:mi>l</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:mo>&#x02212;</mml:mo><mml:mstyle displaystyle='true'><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>l</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:mo>&#x000D7;</mml:mo><mml:mstyle displaystyle='true'><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>l</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:msub><mml:mover accent='true'><mml:mi>Y</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover><mml:mrow><mml:mi>l</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi>D</mml:mi><mml:mn>1</mml:mn><mml:mo>=</mml:mo><mml:mstyle displaystyle='true'><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:msub><mml:mi>n</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mo>&#x000D7;</mml:mo><mml:mstyle displaystyle='true'><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>l</mml:mi><mml:mo 
stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>l</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi>D</mml:mi><mml:mn>2</mml:mn><mml:mo>=</mml:mo><mml:mstyle displaystyle='true'><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>l</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:mo>&#x000D7;</mml:mo><mml:mstyle displaystyle='true'><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow></mml:mstyle><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mi>l</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p><inline-formula><mml:math id="M31"><mml:mrow><mml:msub><mml:mrow><mml:mi>q</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:mi>&#x003C1;</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>&#x003C1;</mml:mi><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, <italic>n</italic><sub><italic>l</italic></sub> &#x0003D; <italic>n</italic><sub><italic>l</italic>(0)</sub>&#x0002B;<italic>n</italic><sub><italic>l</italic>(1)</sub>, and the summations are over <italic>l</italic> &#x0003D; 1, ..., <italic>L</italic>. To gain better insight into this estimator, it helps to consider a few examples.</p>
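To make the expressions above concrete, the combined estimate can be computed directly from strata counts and group means. The following numpy sketch implements C1, C2, D1, D2, and q_l exactly as defined above; the function name and interface are illustrative rather than from the paper.

```python
import numpy as np

def gp_stratified_ate(n1, n0, ybar1, ybar0, rho):
    """tau_hat = lambda*tau_R + (1 - lambda)*tau_C from strata summaries.

    n1, n0       : treated/untreated counts per stratum, n_{l(1)}, n_{l(0)}
    ybar1, ybar0 : treated/untreated outcome means per stratum
    rho          : within-stratum correlation, rho = 1/(1 + sigma0^2)
    """
    n1, n0 = np.asarray(n1, float), np.asarray(n0, float)
    ybar1, ybar0 = np.asarray(ybar1, float), np.asarray(ybar0, float)
    nl = n1 + n0
    q = 1.0 / (1.0 - rho + rho * nl)  # q_l = (1 - rho + rho*n_l)^(-1)
    C1 = np.sum(q * nl) * np.sum(q * n1 * n0 * (ybar1 - ybar0))
    C2 = (np.sum(q * n0) * np.sum(q * n1 * ybar1)
          - np.sum(q * n1) * np.sum(q * n0 * ybar0))
    D1 = np.sum(q * nl) * np.sum(q * n1 * n0)
    D2 = np.sum(q * n1) * np.sum(q * n0)
    lam = rho * D1 / (rho * D1 + (1.0 - rho) * D2)
    tau_R, tau_C = C1 / D1, C2 / D2
    return lam * tau_R + (1.0 - lam) * tau_C, lam, tau_R, tau_C

# Matched-pairs check: with one treated and one untreated unit per stratum,
# the estimator reduces to the average within-pair difference (here 3.0).
tau_hat, lam, tau_R, tau_C = gp_stratified_ate(
    [1, 1, 1], [1, 1, 1], [3.0, 5.0, 7.0], [1.0, 2.0, 3.0], rho=0.5)
```

The matched-pairs call anticipates Example 1 below: both the within-strata and the between-group components collapse to the mean of the pairwise differences.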
<p><bold>Example 1. Matched twin experiment</bold>. Consider a matched twin experiment, where for each treated unit there is an untreated twin. Here, we have a 2<italic>n</italic>&#x000D7;2<italic>n</italic> block diagonal matrix <bold>&#x003A3;</bold><sub><bold>2</bold><italic><bold>n</bold></italic></sub> &#x0003D; <italic><bold>I</bold></italic><sub><italic>n</italic></sub>&#x02297;<italic><bold>J</bold></italic><bold><sub>2</sub></bold>&#x0002B;&#x003C3;<sub>0</sub><sup>2</sup><italic><bold>I</bold></italic><sub>2<italic><bold>n</bold></italic></sub>. Thus, <inline-formula><mml:math id="M32"><mml:mrow><mml:msup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M33"><mml:mrow><mml:mi>&#x003C1;</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:mrow></mml:math></inline-formula>, <italic>n</italic><sub><italic>l</italic></sub> &#x0003D; 2, <italic>n</italic><sub><italic>l</italic>(0)</sub> &#x0003D; <italic>n</italic><sub><italic>l</italic>(1)</sub> &#x0003D; 1. 
Substituting these into the treatment effect formula derived above, we obtain the same estimator of the treatment effect as 1:1 matching, <inline-formula><mml:math id="M34"><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>&#x00232;</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x00232;</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>.</p>
<p><bold>Example 2. Cluster randomized experiment</bold>. Consider a cluster randomized experiment, where the true propensity of treatment assignment is known. Suppose the strata are of equal size; <bold>&#x003A3;</bold> is a block diagonal matrix of <italic><bold>I</bold></italic><sub><italic><bold>L</bold></italic></sub>&#x02297;<italic><bold>J</bold></italic><sub><italic><bold>n</bold></italic></sub>&#x0002B;&#x003C3;<sub>0</sub><sup>2</sup><italic><bold>I</bold></italic><sub><italic><bold>n</bold></italic></sub>, where <italic>L</italic> is the total number of strata and the total sample size is <italic>N</italic> &#x0003D; <italic>Ln</italic>. It is straightforward to derive <inline-formula><mml:math id="M35"><mml:mrow><mml:msup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:math></inline-formula>, <inline-formula><mml:math id="M36"><mml:mrow><mml:mi>&#x003C1;</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:mrow></mml:math></inline-formula>, <italic>n</italic><sub><italic>l</italic></sub> &#x0003D; <italic>n</italic>, for <italic>l</italic> &#x0003D; 1, ..., <italic>L</italic>. 
Then the treatment effect is a weighted sum of <inline-formula><mml:math id="M37"><mml:mrow><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>C</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>&#x00232;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x00232;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M38"><mml:mrow><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>R</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mo>&#x02211;</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x00232;</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x00232;</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msub></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>&#x02211;</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mrow></mml:math></inline-formula>, where the weight <inline-formula><mml:math id="M39"><mml:mrow><mml:mi>&#x003BB;</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x02211;</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x0002B;</mml:mo><mml:mi>N</mml:mi><mml:mo>&#x02211;</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mo
stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mrow></mml:math></inline-formula> is a function of sample sizes and <inline-formula><mml:math id="M40"><mml:mrow><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:math></inline-formula>. We can see that when <inline-formula><mml:math id="M41"><mml:mrow><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x02192;</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula> or <italic>n</italic><sub><italic>l</italic></sub> &#x02192; &#x0221E;, &#x003BB; &#x02192; 1 and <inline-formula><mml:math id="M42"><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover><mml:mo>&#x02192;</mml:mo><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>R</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>. That is, when the outcomes are measured without error, the treatment effect is a weighted average of &#x00232;<sub><italic>l</italic>(1)</sub>&#x02212;&#x00232;<sub><italic>l</italic>(0)</sub>, i.e., the group mean difference for each stratum. 
As <inline-formula><mml:math id="M43"><mml:mrow><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:math></inline-formula> increases, &#x003BB; decreases, and the estimate of &#x003C4; puts more weight on <inline-formula><mml:math id="M44"><mml:mrow><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>C</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>. In other words, the GP estimate of the treatment effect is a shrinkage estimator: it shrinks the strata-level treatment effect toward the overall sample mean difference, and more so when the outcome variance is larger.</p>
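The shrinkage behavior in Example 2 can be checked numerically. The following is a minimal sketch (function name illustrative) of the weight lambda = rho*D1/(rho*D1 + (1 - rho)*D2) with rho = 1/(1 + sigma0^2), using the definitions of D1, D2, and q_l given above:

```python
import numpy as np

def shrink_weight(n1, n0, sigma0_sq):
    """lambda = rho*D1 / (rho*D1 + (1 - rho)*D2) with rho = 1/(1 + sigma0^2)."""
    n1, n0 = np.asarray(n1, float), np.asarray(n0, float)
    rho = 1.0 / (1.0 + sigma0_sq)
    nl = n1 + n0
    q = 1.0 / (1.0 - rho + rho * nl)       # q_l
    D1 = np.sum(q * nl) * np.sum(q * n1 * n0)
    D2 = np.sum(q * n1) * np.sum(q * n0)
    return rho * D1 / (rho * D1 + (1.0 - rho) * D2)

# Four equal strata with 5 treated and 5 untreated units each: lambda starts
# near 1 when the noise variance is negligible and decreases as it grows.
lams = [shrink_weight([5] * 4, [5] * 4, s) for s in (1e-8, 1.0, 10.0)]
```

With negligible noise essentially all weight goes to the within-strata contrasts; larger noise variance shifts weight toward the overall contrast.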
<p><bold>Example 3. A simple observational study</bold>. Consider a binary covariate <italic>X</italic> &#x0003D; 0, 1, where the treatment is assigned differentially based on <italic>X</italic>: <italic>Pr</italic>(<italic>A</italic><sub><italic>i</italic></sub> &#x0003D; 1|<italic>X</italic><sub><italic>i</italic></sub> &#x0003D; <italic>x</italic>) &#x0003D; &#x003C0;(<italic>x</italic>). The frequency table of the observed data is shown in <xref ref-type="table" rid="T1">Table 1</xref>.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Data table for Example 3.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<th/>
<th valign="top" align="center"><bold><italic>A</italic> &#x0003D; 0</bold></th>
<th valign="top" align="center"><bold><italic>A</italic> &#x0003D; 1</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left"><italic>X</italic> &#x0003D; 0</td>
<td valign="top" align="center"><italic>n</italic><sub>0(0)</sub></td>
<td valign="top" align="center"><italic>n</italic><sub>0(1)</sub></td>
</tr> <tr>
<td valign="top" align="left"><italic>X</italic> &#x0003D; 1</td>
<td valign="top" align="center"><italic>n</italic><sub>1(0)</sub></td>
<td valign="top" align="center"><italic>n</italic><sub>1(1)</sub></td>
</tr></tbody>
</table>
</table-wrap>
<p>The treatment effect can be derived based on Equation (7). When <inline-formula><mml:math id="M45"><mml:mrow><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x02192;</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula>, &#x003BB; &#x02192; 1 and <inline-formula><mml:math id="M46"><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover><mml:mo>&#x02192;</mml:mo><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>R</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>, and we have</p>
<disp-formula id="E16"><mml:math id="M47"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:mi>C</mml:mi><mml:mn>1</mml:mn><mml:mo>=</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>q</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:msub><mml:mi>n</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>q</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:msub><mml:mi>n</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x000D7;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mn>0</mml:mn><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mn>0</mml:mn><mml:mo stretchy='false'>(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>Y</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover><mml:mrow><mml:mn>0</mml:mn><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>Y</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover><mml:mrow><mml:mn>0</mml:mn><mml:mo stretchy='false'>(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mo>+</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mn>1</mml:mn><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mn>1</mml:mn><mml:mo stretchy='false'>(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:mo 
stretchy='false'>(</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>Y</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover><mml:mrow><mml:mn>1</mml:mn><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>Y</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover><mml:mrow><mml:mn>1</mml:mn><mml:mo stretchy='false'>(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi>D</mml:mi><mml:mn>1</mml:mn><mml:mo>=</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>q</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:msub><mml:mi>n</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>q</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:msub><mml:mi>n</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x000D7;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mn>0</mml:mn><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mn>0</mml:mn><mml:mo stretchy='false'>(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>q</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mn>1</mml:mn><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub><mml:msub><mml:mi>n</mml:mi><mml:mrow><mml:mn>1</mml:mn><mml:mo stretchy='false'>(</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>We can derive</p>
<disp-formula id="E17"><mml:math id="M48"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:mover accent='true'><mml:mi>&#x003C4;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>C</mml:mi><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>D</mml:mi><mml:mn>1</mml:mn></mml:mrow></mml:mfrac></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:msub><mml:mi>&#x003C0;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003C0;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:msub><mml:mi>&#x003C0;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003C0;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>n</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:msub><mml:mi>&#x003C0;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003C0;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mfrac><mml:mover accent='true'><mml:mi>&#x003C4;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mo stretchy='false'>(</mml:mo><mml:mi>X</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>+</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:msub><mml:mi>&#x003C0;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo 
stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003C0;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:msub><mml:mi>&#x003C0;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003C0;</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:msub><mml:mi>n</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:msub><mml:mi>&#x003C0;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003C0;</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mfrac><mml:mover accent='true'><mml:mi>&#x003C4;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mo stretchy='false'>(</mml:mo><mml:mi>X</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>In general, for multiple levels of <italic>X</italic>, the treatment effect is a weighted average of the stratum-specific treatment effects &#x003C4;<sub><italic>l</italic></sub> &#x0003D; <italic>E</italic>(<italic>Y</italic>(1)&#x02212;<italic>Y</italic>(0)|<italic>X</italic> &#x0003D; <italic>l</italic>),</p>
<disp-formula id="E18"><mml:math id="M49"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mover accent='true'><mml:mi>&#x003C4;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mstyle displaystyle='true'><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mover accent='true'><mml:mrow><mml:msub><mml:mi>&#x003C4;</mml:mi><mml:mi>l</mml:mi></mml:msub></mml:mrow><mml:mo stretchy='true'>&#x0005E;</mml:mo></mml:mover></mml:mrow></mml:mstyle></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where the weight <italic>w</italic><sub><italic>l</italic></sub> is determined by the variance of the treatment assignment, <italic>n</italic><sub><italic>l</italic></sub>&#x003C0;<sub><italic>l</italic></sub>(1&#x02212;&#x003C0;<sub><italic>l</italic></sub>) with &#x003C0;<sub><italic>l</italic></sub> &#x0003D; <italic>Pr</italic>(<italic>A</italic> &#x0003D; 1|<italic>X</italic> &#x0003D; <italic>l</italic>), so that &#x003C0;<sub><italic>l</italic></sub> &#x0003D; 0.5 receives the maximum possible weight. On the other hand, subgroups where &#x003C0;<sub><italic>l</italic></sub> is very small or very large contribute very little to the overall averaged treatment effect. When there is non-ignorable noise, <inline-formula><mml:math id="M50"><mml:mrow><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x0003E;</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula>, the treatment effect is again a shrinkage estimate of the weighted average of the heterogeneous treatment effects, shrinking toward the overall contrast between the treated and untreated groups.</p>
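As a quick numerical illustration (the stratum sizes and propensities are hypothetical), the normalized weights proportional to n_l*pi_l*(1 - pi_l) concentrate on strata whose propensity is near 0.5:

```python
import numpy as np

# Hypothetical strata of equal size with propensities 0.05, 0.5, and 0.95.
n = np.array([100.0, 100.0, 100.0])
pi = np.array([0.05, 0.5, 0.95])

w = n * pi * (1.0 - pi)   # un-normalized weights n_l * pi_l * (1 - pi_l)
w = w / w.sum()           # normalized weights w_l
```

Here the middle stratum receives about 72% of the weight, while the two extreme-propensity strata receive about 14% each.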
<p>The above demonstration considered a categorical <italic>X</italic>, with K being a block diagonal matrix of 0s and 1s. For general types of <italic>X</italic>, the squared exponential covariance function offers a way to specify distance matching, which closely resembles Mahalanobis distance matching. For a pair of &#x0201C;matched&#x0201D; individuals, i.e., sample units with the same set of confounding variables <italic><bold>v</bold></italic><sub><italic><bold>i</bold></italic></sub> &#x0003D; <italic><bold>v</bold></italic><sub><italic><bold>j</bold></italic></sub>, the model specifies <inline-formula><mml:math id="M51"><mml:mrow><mml:mi>C</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>r</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:math></inline-formula>. In other words, the &#x0201C;matched&#x0201D; individuals are expected to be exchangeable. As the data points move further apart in the covariate space, their correlation becomes smaller. 
When the distance is sufficiently far, the model specifies <inline-formula><mml:math id="M52"><mml:mrow><mml:mi>C</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>r</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02248;</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula>, or &#x0201C;unmatched.&#x0201D; Distinct length scale parameters are used to allow some confounders to play more important roles than others in matching. By manipulating the values of <italic>v</italic><sub><italic>i</italic></sub> and the corresponding length scale parameter, one could formulate the SE covariance matrix to reflect a known 0/1 or various degrees of matching structure. However, the matching structure is usually unknown and is left to be estimated in the model, informed by the observed data. Unlike the propensity score or other distance matching methods, using the GP covariance function as the matching tool provides a flexible and data-driven way of defining &#x0201C;similarity&#x0201D; between any pair of data points, and thus offers more weight to the &#x0201C;similar&#x0201D; data points in a finer gradient.</p>
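The role of the length scales can be sketched with a generic squared exponential correlation, one length scale per confounder; this illustrates the kernel's behavior and is not the paper's estimation code:

```python
import numpy as np

def se_corr(v_i, v_j, length_scales):
    """Squared exponential correlation between two units' confounder vectors.

    Matched units (v_i == v_j) have correlation 1; the correlation decays
    toward 0 as the units move apart, and a smaller length scale makes the
    corresponding confounder more decisive for "matching".
    """
    v_i, v_j, ls = (np.asarray(a, float) for a in (v_i, v_j, length_scales))
    return float(np.exp(-0.5 * np.sum(((v_i - v_j) / ls) ** 2)))
```

With identical confounder vectors the correlation is exactly 1 ("matched"); pairs far apart in covariate space are effectively "unmatched" with correlation near 0, and shrinking one length scale sharpens the matching on that confounder.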
</sec>
<sec>
<title>3.2. Doubly robust property</title>
<p>Causal inference estimators with the doubly robust (DR) property are particularly attractive given their ability to address the dual data-generating processes underlying the causal inference problem. Multiple versions of DR causal estimators (e.g., Scharfstein et al. [<xref ref-type="bibr" rid="B29">29</xref>], Bang and Robins [<xref ref-type="bibr" rid="B30">30</xref>], and Chernozhukov et al. [<xref ref-type="bibr" rid="B31">31</xref>]) have been proposed. They can all be considered as a contrast between two weighted terms of the treatment groups, and their DR properties are established under correct specification of either the outcome regression model or the propensity score. Such an argument is not straightforward within the Bayesian framework, although new developments have emerged that link empirical likelihood with estimating equations for parameter estimation and construct Bayesian methods for models formulated through moment restrictions (e.g., Schennach [<xref ref-type="bibr" rid="B32">32</xref>], Chib et al. [<xref ref-type="bibr" rid="B33">33</xref>], Florens and Simoni [<xref ref-type="bibr" rid="B34">34</xref>], and Luo et al. [<xref ref-type="bibr" rid="B35">35</xref>]).</p>
<p>We conjecture that GPMatch possesses the DR property asymptotically in the following sense. Let the true average treatment effect (ATE) be &#x003C4;<sup>&#x0002A;</sup>; the GPMatch estimator is an unbiased estimate of the ATE when either of the following conditions holds: (i) the GP mean function <inline-formula><mml:math id="M53"><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Z</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup><mml:mi>&#x003B3;</mml:mi></mml:mrow></mml:math></inline-formula> is correctly specified; or (ii) the GP covariance function is correctly specified, in the sense that, from the weight-space point of view of GP regression, the weighted sum of treatment assignments consistently estimates the true treatment propensity &#x003C0;<sub><italic>i</italic></sub> &#x0003D; <italic>Pr</italic>(<italic>A</italic> &#x0003D; 1|<italic>X</italic><sub><italic>i</italic></sub>).</p>
<p>Under condition (i), the partial linear component of <inline-formula><mml:math id="M54"><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mi>Z</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup><mml:mi>&#x003B3;</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>f</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is correctly specified, and we may apply Theorem 1 of Choi and Woo [<xref ref-type="bibr" rid="B24">24</xref>], which suggests that the posteriors of the GPMatch parameters can be consistently estimated. It follows that the averaged treatment effect can also be consistently estimated.</p>
<p>The second condition assumes a known GP prior. We consider a simple misspecification of the form <italic>E</italic>(<italic>Y</italic><sub><italic>i</italic></sub>) &#x0003D; <italic>f</italic><sub><italic>i</italic></sub>(<italic>x</italic>)&#x0002B;<italic>A</italic><sub><italic>i</italic></sub>&#x003C4;. From the weight-space point of view, given &#x003C4;, the predicted value of the potential outcome from the GPMatch model can be asymptotically approximated by</p>
<disp-formula id="E19"><label>(8)</label><mml:math id="M55"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msup><mml:mover accent='true'><mml:mrow><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo stretchy='true'>&#x0005E;</mml:mo></mml:mover><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mi>a</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mstyle displaystyle='false'><mml:munderover><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:munderover><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi>j</mml:mi></mml:msub><mml:mi>&#x003C4;</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mstyle><mml:mo>+</mml:mo><mml:mi>a</mml:mi><mml:mi>&#x003C4;</mml:mi><mml:mo>=</mml:mo><mml:mover accent='true'><mml:mrow><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo stretchy='true'>&#x002DC;</mml:mo></mml:mover><mml:mo>+</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:mi>a</mml:mi><mml:mo>&#x02212;</mml:mo><mml:mover accent='true'><mml:mrow><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo stretchy='true'>&#x002DC;</mml:mo></mml:mover><mml:mo stretchy='false'>)</mml:mo><mml:mi>&#x003C4;</mml:mi><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <inline-formula><mml:math id="M56"><mml:mrow><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:munderover><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M57"><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x000C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:munderover><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>, for <italic>i</italic> &#x0003D; 1, ..., <italic>n</italic>. 
The weight is <inline-formula><mml:math id="M58"><mml:mrow><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003BA;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mrow><mml:mi>&#x003BA;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math id="M59"><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003BA;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mstyle mathvariant="bold"><mml:mi>k</mml:mi></mml:mstyle><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle mathvariant="bold"><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup><mml:msup><mml:mrow><mml:mstyle mathvariant="bold"><mml:mi>&#x003A3;</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, with <italic><bold>k</bold></italic>(<italic><bold>v</bold></italic><sub><italic><bold>j</bold></italic></sub>) &#x0003D; (<italic>k</italic>(<italic><bold>v</bold></italic><sub><italic><bold>j</bold></italic></sub>, <italic><bold>v</bold></italic><sub><italic><bold>i</bold></italic></sub>))<sub><italic>n</italic>&#x000D7;1</sub>. 
Thus, the <inline-formula><mml:math id="M60"><mml:mrow><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M61"><mml:mrow><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula> could be considered as the Nadaraya-Watson estimators of the observed outcome and treatment assignment for the i-th unit in the sample. The estimate of the treatment effect can be obtained by solving <inline-formula><mml:math id="M62"><mml:mrow><mml:mfrac><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:munderover><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mi>&#x02202;</mml:mi><mml:mi>&#x003C4;</mml:mi></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula>. We can see that, given a known GP covariance function, the GPMatch treatment effect <inline-formula><mml:math id="M63"><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow></mml:math></inline-formula> is an M-estimator satisfying <inline-formula><mml:math id="M64"><mml:mrow><mml:mo>&#x02211;</mml:mo><mml:msub><mml:mrow><mml:mtext>&#x003A8;</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula>, where</p>
<disp-formula id="E20"><label>(9)</label><mml:math id="M65"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mi>&#x003A8;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>&#x003C4;</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:mover accent='true'><mml:mrow><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mo stretchy='true'>&#x002DC;</mml:mo></mml:mover><mml:mo>&#x02212;</mml:mo><mml:mi>&#x003C4;</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>A</mml:mi><mml:mo>&#x002DC;</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>A</mml:mi><mml:mo>&#x002DC;</mml:mo></mml:mover><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Let the true propensity be &#x003C0;<sub><italic>i</italic></sub> &#x0003D; <italic>Pr</italic>(<italic>A</italic> &#x0003D; 1|<italic>X</italic><sub><italic>i</italic></sub>). Given the SUTVA, we have <inline-formula><mml:math id="M66"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>. 
Given the true treatment effect &#x003C4;<sup>&#x0002A;</sup>, we can write <inline-formula><mml:math id="M67"><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>a</mml:mi><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003C0;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:msup><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mrow><mml:mo>*</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>. 
Thus, when <inline-formula><mml:math id="M68"><mml:mrow><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003C0;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> asymptotically, we have <inline-formula><mml:math id="M69"><mml:mrow><mml:msub><mml:mrow><mml:mtext>&#x003A8;</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003C0;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003C4;</mml:mi><mml:mo>-</mml:mo><mml:msup><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mrow><mml:mo>*</mml:mo></mml:mrow></mml:msup></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003C0;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>. It follows that the estimating function is conditionally unbiased, i.e., <inline-formula><mml:math id="M70"><mml:mrow><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mtext>&#x003A8;</mml:mtext></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mrow><mml:mo>*</mml:mo></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula>, for <italic>i</italic> &#x0003D; 1, ..., <italic>n</italic>.</p>
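The estimating equation above has a closed-form solution for &#x003C4; as a ratio of residual cross-products. The sketch below is our own illustration, not the paper's implementation: a normalized squared-exponential kernel smoother stands in for the GP-derived weights <italic>w</italic><sub><italic>ij</italic></sub>, and the function name and arguments are hypothetical.

```python
import numpy as np

def gpmatch_tau_sketch(X, A, Y, length_scale=1.0):
    """Solve the estimating equation sum_i Psi_i(tau) = 0, with
    Psi_i(tau) = (Y_i - Ytilde_i - tau*(A_i - Atilde_i)) * (A_i - Atilde_i),
    where the tilde quantities are kernel-weighted (Nadaraya-Watson) smooths.
    A normalized squared-exponential kernel approximates the GP weights here."""
    X = np.asarray(X, float).reshape(len(A), -1)
    A, Y = np.asarray(A, float), np.asarray(Y, float)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    K = np.exp(-0.5 * d2 / length_scale ** 2)             # SE kernel matrix
    W = K / K.sum(axis=1, keepdims=True)                  # rows sum to one
    Y_tilde, A_tilde = W @ Y, W @ A                       # smoothed outcome and treatment
    rY, rA = Y - Y_tilde, A - A_tilde                     # residuals of both processes
    # closed-form root of the quadratic objective in tau
    return float((rY * rA).sum() / (rA ** 2).sum())
```

Note that units whose smoothed treatment <italic>A&#x0007E;</italic><sub><italic>i</italic></sub> is close to 0 or 1 have residuals near zero and so contribute little to the ratio, which is the robustness property discussed in the remarks below.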
<p>Remark 1. First, Equation (9) is the empirical correlation between the residuals from the outcome model and the residuals from the propensity of treatment assignment. Thus, GPMatch attempts to induce independence between the treatment selection process and the outcome model, just as the G-estimation equation does (see Robins et al. [<xref ref-type="bibr" rid="B36">36</xref>] and Vansteelandt and Joffe [<xref ref-type="bibr" rid="B37">37</xref>]). Unlike the moment-based G-estimator, which requires fitting two separate models for the outcome and the propensity score, the GPMatch approach estimates the covariance parameters at the same time as the treatment and mean function parameters, all within a full Bayesian likelihood framework.</p>
<p>Second, some data points may have a treatment propensity close to 0 or 1. Such data points are usually a cause of concern in causal inference. In naive regression-type models such as BART, they may cause unstable estimation without added regularization. In inverse probability of treatment weighting methods, a few such data points may exert undue influence on the estimate of the treatment effect. In matching methods, these data points are often discarded, a practice that can leave the sample no longer representative of the target population. As with G-estimation, we can see from Equation (9) that these data points contribute little or no information to the GPMatch estimate of the treatment effect. Thus, GPMatch shares the same added robustness as G-estimation.</p>
<p>Third, the GPMatch model with a parametric mean function can be used in predicting the potential outcomes for any new unit, by <inline-formula><mml:math id="M71"><mml:mrow><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>^</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mstyle mathvariant="bold-italic"><mml:msubsup><mml:mrow><mml:mi>Z</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msubsup><mml:mover accent="true"><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mstyle><mml:mo>&#x0002B;</mml:mo><mml:mstyle mathvariant="bold-italic"><mml:msub><mml:mrow><mml:mi>&#x003A3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mrow><mml:mi>&#x003A3;</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mstyle><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mstyle><mml:mo>-</mml:mo><mml:mstyle mathvariant="bold-italic"><mml:mi>Z</mml:mi><mml:mover accent="true"><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, where <bold>&#x003A3;</bold><sub><italic><bold>i</bold></italic></sub> denotes the i-th row of <bold>&#x003A3;</bold>. Given the model setup, two regression surfaces are predicted, where the distance between the two regression surfaces represents the treatment effect. By including the treatment by covariate interactions, the model could offer estimates of conditional averaged treatment effects for pre-specified patient characteristics.</p>
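The prediction step just described is the standard GP posterior mean with a parametric mean function. As a minimal sketch (our own illustration, assuming a squared-exponential kernel and a plug-in estimate of &#x003B3;; names such as `gp_posterior_mean` are hypothetical), the two regression surfaces are obtained by evaluating the predictor with the treatment indicator in the new design row set to 0 and to 1:

```python
import numpy as np

def se_kernel(X1, X2, ell=1.0, sig2=1.0):
    # squared-exponential covariance between two sets of inputs
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return sig2 * np.exp(-0.5 * d2 / ell ** 2)

def gp_posterior_mean(X, Z, Y, gamma_hat, Xnew, Znew, ell=1.0, sig2=1.0, noise=0.1):
    """Yhat_new = Znew @ gamma_hat + Sigma_new @ Sigma^{-1} @ (Y - Z @ gamma_hat)."""
    Sigma = se_kernel(X, X, ell, sig2) + noise * np.eye(len(Y))  # training covariance
    Sigma_new = se_kernel(Xnew, X, ell, sig2)                    # cross covariance
    resid = Y - Z @ gamma_hat                                    # residual of the mean
    return Znew @ gamma_hat + Sigma_new @ np.linalg.solve(Sigma, resid)
```

Evaluating with the treatment column of `Znew` set to all zeros and then all ones traces out the two surfaces; their distance is the treatment effect estimate.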
<p>Finally, in real-world applications, we may never know the true functional form of either the mean or the covariance function. The only exception is the designed experimental study, where propensity scores are known. When the true propensity score is known, it can be directly used for specifying the GP prior. With high-dimensional X, we may wish to reduce dimensions first. One approach is to estimate summary scores, such as the estimated propensity score. Another approach is to employ variable selection procedures. As in the propensity-score-based methods, we wish to design the covariance function to ensure covariate balance between the treatment groups. Given the fitted <inline-formula><mml:math id="M72"><mml:mrow><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">M</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>G</mml:mi><mml:mi>P</mml:mi><mml:mi>M</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>c</mml:mi><mml:mi>h</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> model, covariate balance can be diagnosed by comparing weighted samples of <inline-formula><mml:math id="M73"><mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>X</mml:mi><mml:mo>|</mml:mo><mml:mi>A</mml:mi><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">M</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>G</mml:mi><mml:mi>P</mml:mi><mml:mi>M</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>c</mml:mi><mml:mi>h</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M74"><mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>X</mml:mi><mml:mo>|</mml:mo><mml:mi>A</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mi
mathvariant="-tex-caligraphic">M</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>G</mml:mi><mml:mi>P</mml:mi><mml:mi>M</mml:mi><mml:mi>a</mml:mi><mml:mi>t</mml:mi><mml:mi>c</mml:mi><mml:mi>h</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> (see an example in Huang et al. [<xref ref-type="bibr" rid="B38">38</xref>]).</p>
</sec>
</sec>
<sec id="s4">
<title>4. Simulation studies</title>
<p>To empirically evaluate the performance of GPMatch in real-world settings where neither the matching structure nor the functional form of the outcome model is known, we conducted four sets of simulation studies. The first set evaluated the frequentist performance of GPMatch; the second compared the performance of GPMatch against MD match; the third considered a setting with a large number of correlated background variables, of which only a few are relevant to the data-generating mechanism; and the last utilized the widely used Kang and Schafer design, comparing the performance of GPMatch against some commonly used propensity score methods as well as the nonparametric Bayesian additive regression tree (BART) method.</p>
<p>In all simulation studies, the GPMatch approach used a squared exponential covariance function, including only the treatment indicator in the mean function and all observed covariates in the covariance function, unless otherwise noted. The results were compared with the following widely used causal inference methods: sub-classification by PS quantile (QNT-PS), augmented inverse probability of treatment weighting (AIPTW), a linear model with PS adjustment (LM-PS), a linear model with spline fit PS adjustment [LM-sp(PS)], and BART. Cubic B-splines with knots based on quantiles of PS were used for LM-sp(PS). We also considered the direct linear regression model (LM) as a comparison. The ATE estimates were obtained by averaging over 5,000 posterior MCMC draws, after 5,000 burn-in iterations. For each scenario, three sample sizes were considered, <italic>N</italic> = 100, 200, and 400. The standard error and the 95% symmetric interval estimate of ATE for each replicate were calculated from the 5,000 MCMC draws. 
For comparing performances of different methods, all results were summarized over <italic>N</italic> = 100 replicates by the root mean square error RMSE <inline-formula><mml:math id="M75"><mml:mrow><mml:mo>=</mml:mo><mml:msqrt><mml:mrow><mml:mo>&#x02211;</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>/</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msqrt></mml:mrow></mml:math></inline-formula>, median absolute error MAE <inline-formula><mml:math id="M76"><mml:mrow><mml:mo>=</mml:mo><mml:mi>m</mml:mi><mml:mi>e</mml:mi><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mo>&#x02223;</mml:mo><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:mi>&#x003C4;</mml:mi><mml:mo>&#x02223;</mml:mo></mml:mrow></mml:math></inline-formula>, coverage rate Rc = (the number of intervals that include &#x003C4;)/<italic>N</italic> of the 95% symmetric posterior interval, the averaged standard error estimate <inline-formula><mml:math id="M77"><mml:mrow><mml:mi>S</mml:mi><mml:msub><mml:mrow><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mi>a</mml:mi><mml:mi>v</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mo>&#x02211;</mml:mo><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>/</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:math></inline-formula>, where <inline-formula><mml:math 
id="M78"><mml:mrow><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> is the square root of the estimated standard deviation of <inline-formula><mml:math id="M79"><mml:mrow><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula>, and the standard error of ATE was calculated from 100 replicates <inline-formula><mml:math id="M80"><mml:mrow><mml:mi>S</mml:mi><mml:msub><mml:mrow><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi><mml:mi>m</mml:mi><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msqrt><mml:mrow><mml:mo>&#x02211;</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:mover accent="true"><mml:mrow><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>/</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>N</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msqrt></mml:mrow></mml:math></inline-formula>.</p>
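The replicate-level summaries defined above are straightforward to compute. The helper below is our own sketch (argument names are hypothetical), collecting RMSE, MAE, bias, coverage rate, and the two standard error summaries from a set of simulation replicates:

```python
import numpy as np

def summarize_replicates(tau_hat, se_hat, ci_lo, ci_hi, tau_true):
    """Summary metrics over simulation replicates, as defined in the text."""
    tau_hat, se_hat = np.asarray(tau_hat, float), np.asarray(se_hat, float)
    ci_lo, ci_hi = np.asarray(ci_lo, float), np.asarray(ci_hi, float)
    err = tau_hat - tau_true
    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.median(np.abs(err))),
        "Bias": float(np.mean(err)),
        # proportion of interval estimates covering the true effect
        "Rc": float(np.mean((ci_lo <= tau_true) & (tau_true <= ci_hi))),
        "SE_ave": float(np.mean(se_hat)),           # averaged SE estimate
        "SE_emp": float(np.std(tau_hat, ddof=1)),   # empirical SE over replicates
    }
```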
<sec>
<title>4.1. Well-calibrated frequentist performances</title>
<p>Let the single covariate <italic>x</italic>&#x0007E;<italic>N</italic>(0, 1). The potential outcome was generated by <inline-formula><mml:math id="M81"><mml:mrow><mml:msup><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mi>x</mml:mi></mml:mrow></mml:msup><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>&#x0002B;</mml:mo><mml:mi>U</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x000D7;</mml:mo><mml:mi>a</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> for <italic>a</italic> &#x0003D; 0, 1, where the true treatment effect was 1&#x0002B;<italic>U</italic><sub><italic>i</italic></sub> for the i-th individual unit. The (<italic>U, U</italic><sub>0</sub>) are unobserved covariates. The treatment was selected for each individual following <italic>logit</italic>(<italic>P</italic>(<italic>A</italic> &#x0003D; 1|<italic>X</italic>)) &#x0003D; &#x02212;0.2&#x0002B;(1.8<italic>X</italic>)<sup>1/3</sup>. 
The observed outcome was generated by <inline-formula><mml:math id="M82"><mml:mrow><mml:mi>y</mml:mi><mml:mo>|</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>a</mml:mi><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>. Two parameter settings were considered. First, we set <inline-formula><mml:math id="M83"><mml:mrow><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>25</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>75</mml:mn></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, i.e., all individual units had the same uniform treatment effect of 1, and outcomes were observed with measurement error. 
Second, we set <inline-formula><mml:math id="M84"><mml:mrow><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>1</mml:mn><mml:msup><mml:mrow><mml:mn>5</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, i.e., the treatment effect varied from individual unit to unit, but the averaged treatment effect remained at 1.</p>
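For concreteness, the single-covariate data-generating process above can be sketched as follows. This is our own code, not the authors'; `simulate_single_covariate` and its arguments are illustrative names, the N(&#x000B7;, &#x000B7;) values are read as mean and variance, and the signed cube root is used for negative values of 1.8<italic>x</italic>:

```python
import numpy as np

def simulate_single_covariate(n, heterogeneous=False, seed=None):
    """Sketch of the Section 4.1 DGP: y^(a) = exp(x) + (1 + U) * a + U0,
    with logit P(A = 1 | x) = -0.2 + (1.8 x)^(1/3)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, n)
    logit = -0.2 + np.cbrt(1.8 * x)        # np.cbrt handles negative arguments
    A = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))
    if heterogeneous:
        # setting 2: U ~ N(0, 0.15^2), U0 ~ N(0, 1), sigma0^2 = 0
        U, U0, sigma0 = rng.normal(0, 0.15, n), rng.normal(0, 1.0, n), 0.0
    else:
        # setting 1: U = 0, U0 ~ N(0, 0.25), sigma0^2 = 0.75
        U, U0, sigma0 = np.zeros(n), rng.normal(0, 0.5, n), np.sqrt(0.75)
    y = np.exp(x) + (1.0 + U) * A + U0 + sigma0 * rng.normal(0.0, 1.0, n)
    return x, A, y
```

In both settings the total outcome noise variance is 1, and the averaged treatment effect is 1.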
<p>The simulation results are summarized in <xref ref-type="fig" rid="F1">Figure 1</xref> as histograms of the posterior mean over the 100 replicates across the three sample sizes. <xref ref-type="table" rid="T2">Table 2</xref> presents the results of GPMatch and the Oracle standard, where the Oracle estimate was obtained by fitting the true outcome-generating model as a benchmark. In both <xref ref-type="fig" rid="F1">Figure 1</xref> and <xref ref-type="table" rid="T2">Table 2</xref>, the upper panel presents results from the uniform treatment effect setting, and the lower panel presents results from the heterogeneous treatment effect setting. Under both settings, GPMatch showed well-calibrated frequentist properties, with nominal coverage rates and only slightly larger RMSE. Its averaged bias, RMSE, and MAE improved quickly as the sample size increased, and its performance was almost as good as the Oracle's at a sample size of 400.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Distribution of the GPMatch estimate of ATE, by different sample sizes under the single covariate simulation study setting.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-09-1122114-g0001.tif"/>
</fig>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Results of ATE estimates under the single covariate simulation study setting.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<th valign="top" align="left"><bold>Method</bold></th>
<th valign="top" align="center"><bold>Sample size</bold></th>
<th valign="top" align="center"><bold>RMSE</bold></th>
<th valign="top" align="center"><bold>MAE</bold></th>
<th valign="top" align="center"><bold>Bias</bold></th>
<th valign="top" align="center"><bold>Rc</bold></th>
<th valign="top" align="center"><bold><italic>SE</italic><sub><italic>avg</italic></sub></bold></th>
<th valign="top" align="center"><bold><italic>SE</italic><sub><italic>emp</italic></sub></bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="8" style="background-color:#dee1e1"><inline-formula><mml:math id="M85"><mml:mrow><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>25</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>75</mml:mn></mml:mrow></mml:math></inline-formula></td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">Oracle</td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">0.243</td>
<td valign="top" align="center">0.165</td>
<td valign="top" align="center">&#x02013;0.066</td>
<td valign="top" align="center">0.930</td>
<td valign="top" align="center">0.216</td>
<td valign="top" align="center">0.235</td>
</tr>
 <tr>
<td valign="top" align="center">200</td>
<td valign="top" align="center">0.149</td>
<td valign="top" align="center">0.109</td>
<td valign="top" align="center">0.027</td>
<td valign="top" align="center">0.940</td>
<td valign="top" align="center">0.150</td>
<td valign="top" align="center">0.147</td>
</tr>
 <tr>
<td valign="top" align="center">400</td>
<td valign="top" align="center">0.123</td>
<td valign="top" align="center">0.087</td>
<td valign="top" align="center">&#x02013;0.007</td>
<td valign="top" align="center">0.930</td>
<td valign="top" align="center">0.107</td>
<td valign="top" align="center">0.123</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">GPMatch</td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">0.260</td>
<td valign="top" align="center">0.160</td>
<td valign="top" align="center">&#x02013;0.038</td>
<td valign="top" align="center">0.93</td>
<td valign="top" align="center">0.242</td>
<td valign="top" align="center">0.258</td>
</tr>
 <tr>
<td valign="top" align="center">200</td>
<td valign="top" align="center">0.161</td>
<td valign="top" align="center">0.116</td>
<td valign="top" align="center">0.033</td>
<td valign="top" align="center">0.97</td>
<td valign="top" align="center">0.167</td>
<td valign="top" align="center">0.159</td>
</tr>
 <tr>
<td valign="top" align="center">400</td>
<td valign="top" align="center">0.122</td>
<td valign="top" align="center">0.085</td>
<td valign="top" align="center">&#x02013;0.005</td>
<td valign="top" align="center">0.96</td>
<td valign="top" align="center">0.118</td>
<td valign="top" align="center">0.123</td>
</tr> <tr>
<td valign="top" align="left" colspan="8" style="background-color:#dee1e1"><inline-formula><mml:math id="M86"><mml:mrow><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>1</mml:mn><mml:msup><mml:mrow><mml:mn>5</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula></td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">Oracle</td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">0.220</td>
<td valign="top" align="center">0.134</td>
<td valign="top" align="center">&#x02013;0.011</td>
<td valign="top" align="center">0.92</td>
<td valign="top" align="center">0.213</td>
<td valign="top" align="center">0.221</td>
</tr>
 <tr>
<td valign="top" align="center">200</td>
<td valign="top" align="center">0.159</td>
<td valign="top" align="center">0.098</td>
<td valign="top" align="center">0.001</td>
<td valign="top" align="center">0.94</td>
<td valign="top" align="center">0.151</td>
<td valign="top" align="center">0.159</td>
</tr>
 <tr>
<td valign="top" align="center">400</td>
<td valign="top" align="center">0.107</td>
<td valign="top" align="center">0.077</td>
<td valign="top" align="center">&#x02013;0.003</td>
<td valign="top" align="center">0.95</td>
<td valign="top" align="center">0.107</td>
<td valign="top" align="center">0.108</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">GPMatch</td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">0.237</td>
<td valign="top" align="center">0.152</td>
<td valign="top" align="center">0.013</td>
<td valign="top" align="center">0.97</td>
<td valign="top" align="center">0.244</td>
<td valign="top" align="center">0.238</td>
</tr>
 <tr>
<td valign="top" align="center">200</td>
<td valign="top" align="center">0.175</td>
<td valign="top" align="center">0.114</td>
<td valign="top" align="center">0.007</td>
<td valign="top" align="center">0.94</td>
<td valign="top" align="center">0.169</td>
<td valign="top" align="center">0.175</td>
</tr>
 <tr>
<td valign="top" align="center">400</td>
<td valign="top" align="center">0.117</td>
<td valign="top" align="center">0.084</td>
<td valign="top" align="center">0.001</td>
<td valign="top" align="center">0.96</td>
<td valign="top" align="center">0.117</td>
<td valign="top" align="center">0.118</td>
</tr></tbody>
</table>
<table-wrap-foot>
<p>RMSE, root mean square error; MAE, median absolute error; Bias, estimate minus truth; Rc, rate of coverage by the 95% interval estimate; <italic>SE</italic><sub><italic>avg</italic></sub>, average of the standard error estimates over all replicates; <italic>SE</italic><sub><italic>emp</italic></sub>, empirical standard error of the ATE estimates across all replicates; Oracle, using the true outcome-generating model; GPMatch, Bayesian marginal structural model with Gaussian process prior, with only the treatment effect included in the mean function and <italic>X</italic> included in the covariance function.</p>
</table-wrap-foot>
</table-wrap>
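The summary metrics in the table can be computed directly from the replicate-level estimates; a minimal sketch (note that MAE here denotes the median absolute error, per the table footnote):

```python
import numpy as np

def summarize_replicates(est, se, truth, z=1.96):
    """Summary metrics over simulation replicates.

    est, se : arrays of ATE point estimates and standard error estimates,
              one entry per replicate.
    """
    est, se = np.asarray(est, float), np.asarray(se, float)
    err = est - truth
    return {
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "MAE": np.median(np.abs(err)),   # median absolute error
        "Bias": np.mean(err),            # estimate - truth, averaged
        "Rc": np.mean((est - z * se <= truth) & (truth <= est + z * se)),
        "SE_avg": np.mean(se),           # average of the SE estimates
        "SE_emp": np.std(est, ddof=1),   # empirical SE across replicates
    }
```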
<p>We also applied several commonly adopted causal inference methods, as well as BART, to the simulated data. Their performances are presented in <xref ref-type="fig" rid="F2">Figure 2</xref> as the %bias and the ratios of RMSE and MAE relative to the Oracle results. The results show that the impact of measurement error varies by method, by whether the propensity score is correctly estimated, and by sample size. At sample size 100, even with a correctly specified PS, the %bias ranges from 5 to 10% for the PS-based methods, and the MAE and RMSE are at least 1.5 times those of the Oracle estimates. Their performances improve with increased sample size if the propensity score is correctly specified. However, when the propensity score is misspecified, the performance can get even worse as the sample size increases. Of all the PS-based methods, the flexible LM_sp(PS), which uses a spline fit of the PS, appears to perform the best. The two Bayesian flexible modeling techniques, BART and GP, had the best performances with respect to MAE and RMSE, with BART performing nearly as well as GP when the sample size is <italic>N</italic> = 400. However, BART presented a surprisingly larger %bias for <italic>N</italic> = 200 than for <italic>N</italic> = 100. These results suggest that, by not explicitly acknowledging measurement error, the existing methods may suffer from bias.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Comparisons of percentage bias, root mean square error (RMSE), and median absolute error (MAE) of the ATE estimates by different methods across different sample sizes under Simulation Setting 1a <bold>(upper panel)</bold>: <inline-formula><mml:math id="M87"><mml:mrow><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>25</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>75</mml:mn></mml:mrow></mml:math></inline-formula> and Setting 1b <bold>(lower panel)</bold>: <inline-formula><mml:math id="M88"><mml:mrow><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>1</mml:mn><mml:msup><mml:mrow><mml:mn>5</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:mrow></mml:math></inline-formula>. <sup>1</sup>Propensity score estimated using logistic regression on <italic>logitA</italic>&#x0007E;<italic>X</italic>. <sup>2</sup>Propensity score estimated using logistic regression on <italic>logitA</italic>&#x0007E;<italic>X</italic><sup>1/3</sup>. GPMatch, Bayesian structural model with Gaussian process prior; QNT_PS, propensity score sub-classification by quintiles; AIPTW, augmented inverse probability of treatment weighting; LM, linear regression modeling <italic>Y</italic>&#x0007E;<italic>X</italic>; LM_PS, linear regression modeling with propensity score adjustment; LM_sp(PS), linear regression modeling with spline-fit propensity score adjustment; BART, Bayesian additive regression tree. *The ratios of RMSE at <italic>N</italic> = 200 and <italic>N</italic> = 400 for AIPTW<sup>1</sup> are 24.23 and 11.68, which are truncated.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-09-1122114-g0002.tif"/>
</fig>
</sec>
<sec>
<title>4.2. Compared to Mahalanobis distance matching</title>
<p>To compare the performance of MD matching and GPMatch, we considered a simulation study with two independent covariates <italic>x</italic><sub>1</sub>, <italic>x</italic><sub>2</sub> drawn from the uniform distribution <italic>U</italic>(&#x02212;2, 2). Treatment was assigned by letting <italic>A</italic><sub><italic>i</italic></sub>&#x0007E;<italic>Ber</italic>(&#x003C0;<sub><italic>i</italic></sub>), where</p>
<disp-formula id="E21"><mml:math id="M89"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>g</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:msub><mml:mi>&#x003C0;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>The potential outcomes were generated by</p>
<disp-formula id="E22"><mml:math id="M90"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mi>y</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mi>a</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mn>3</mml:mn><mml:mo>+</mml:mo><mml:mn>5</mml:mn><mml:mi>a</mml:mi><mml:mo>+</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mrow><mml:mn>1</mml:mn><mml:mi>i</mml:mi></mml:mrow><mml:mn>3</mml:mn></mml:msubsup><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E23"><mml:math id="M91"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>The true treatment effect is 5. Three different sample sizes were considered: <italic>N</italic> = 100, 200, and 400. For each setting, 100 replicates were performed and the results were summarized.</p>
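The data-generating process above can be sketched as follows (a minimal Python sketch; the paper's simulations were run in R):

```python
import numpy as np

def simulate_md_setting(n, rng):
    """One replicate of the Section 4.2 design: true ATE = 5."""
    x1 = rng.uniform(-2, 2, n)
    x2 = rng.uniform(-2, 2, n)
    # logit(pi_i) = -x1 - x2  =>  pi_i = 1 / (1 + exp(x1 + x2))
    pi = 1.0 / (1.0 + np.exp(x1 + x2))
    a = rng.binomial(1, pi)
    # Potential outcomes y^(a) = 3 + 5a + x1^3, observed with N(0, 1) noise.
    y = 3.0 + 5.0 * a + x1 ** 3 + rng.normal(0.0, 1.0, n)
    return np.column_stack([x1, x2]), a, y

X, a, y = simulate_md_setting(400, np.random.default_rng(1))
```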
<p>We estimated the ATE by applying Mahalanobis distance matching and GPMatch. The MD matching considered calipers varying from 0.125 to 1 in steps of 0.025, including both <italic>X</italic><sub>1</sub> and <italic>X</italic><sub>2</sub> in the matching, using the function Match in the R package Matching by Sekhon [<xref ref-type="bibr" rid="B39">39</xref>]. The averaged bias and its 5th and 95th percentiles are presented as vertical lines corresponding to the different calipers in <xref ref-type="fig" rid="F3">Figure 3</xref>. To be directly comparable to the matching approach, GPMatch estimated the ATE by including the treatment effect only in the mean function; both <italic>X</italic><sub>1</sub> and <italic>X</italic><sub>2</sub> were considered in the covariance function. The posterior results were generated with 5,000 MCMC samples after 5,000 burn-in. The averaged bias (short dashed horizontal line) and the 5th and 95th percentiles of the ATE estimate (long dashed horizontal lines) are presented in <xref ref-type="fig" rid="F3">Figure 3</xref> for each sample size, along with the bias, median absolute error (MAE), root mean square error (RMSE), and coverage rate (Rc) summarized over the 100 replicates of GPMatch. The bias from the matching method increases with the caliper, and the width of the interval estimate varies by sample size and caliper: it narrows with increased caliper for a sample size of 100, but widens with increased caliper for a sample size of 400. In contrast, GPMatch produced a much more accurate and efficient estimate of the ATE for all sample sizes, with an unbiased ATE estimate and nominal coverage rate. The ranges between the 5th and 95th percentiles of its ATE estimates are always narrower than those from the matching methods for all settings considered, suggesting better efficiency of GPMatch.</p>
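For illustration, a simplified stand-in for the matching comparator might look as follows. This is a hypothetical Python analog, not the R Match function the paper actually uses; in particular, the caliper here is applied directly to the Mahalanobis distance, whereas Match defines calipers in per-covariate standard-deviation units.

```python
import numpy as np

def md_match_ate(X, a, y, caliper):
    """One-to-one Mahalanobis-distance matching with replacement.

    Each unit is matched to its nearest opposite-arm unit; pairs whose
    Mahalanobis distance exceeds `caliper` are dropped.  The ATE is the
    average treated-minus-control difference over retained pairs.
    """
    Vinv = np.linalg.inv(np.cov(X, rowvar=False))
    diffs = []
    for i in range(len(y)):
        others = np.where(a != a[i])[0]
        d = X[others] - X[i]
        md = np.sqrt(np.einsum("ij,jk,ik->i", d, Vinv, d))
        j = others[np.argmin(md)]
        if md.min() <= caliper:
            diffs.append(y[i] - y[j] if a[i] == 1 else y[j] - y[i])
    return np.mean(diffs) if diffs else np.nan
```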
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Simulation study results of comparing GPMatch with Mahalanobis distance matching methods. The circles are the averaged biases of estimates of ATE using Mahalanobis matching with corresponding calipers. The corresponding vertical lines indicate the ranges between the 5th and 95th percentiles of the biases. The horizontal lines are the averaged ATE (short dashed line), and the 5th percentile and 95th percentile (long dashed line) of the biases of the estimates from GPMatch.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-09-1122114-g0003.tif"/>
</fig>
</sec>
<sec>
<title>4.3. High dimension covariates</title>
<p>The background covariates could be of high dimension. While the GP prior could include high-dimensional <italic>X</italic>, the computational burden can be too demanding. To address this issue, we considered two dimension-reduction strategies. First, we used the estimated propensity score in constructing the GP covariance function, where the PS was obtained by a logistic regression on all covariates. Second, we applied a standard stepwise selection procedure to the logistic regression model of treatment selection prior to the GP modeling, so that only the selected variables were included in the GP covariance function. Here, we simply used the default setting of the variable selection procedure implemented in the standard R step function. Lastly, for comparison, we generated the propensity score using the true logistic model.</p>
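The second strategy can be sketched as follows. This is a hypothetical Python sketch: the paper uses R's logistic glm with the default step() procedure, which is approximated here by greedy forward selection on AIC using an effectively unpenalized scikit-learn logistic fit.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def aic_logistic(X, a):
    # Effectively unpenalized logistic fit (large C); AIC = 2k - 2*loglik.
    lr = LogisticRegression(C=1e6, max_iter=1000).fit(X, a)
    p = np.clip(lr.predict_proba(X)[:, 1], 1e-12, 1 - 1e-12)
    loglik = np.sum(a * np.log(p) + (1 - a) * np.log(1 - p))
    return 2 * (X.shape[1] + 1) - 2 * loglik

def forward_select(X, a):
    """Greedy forward selection by AIC; the selected columns then enter
    the GP covariance function in place of the full covariate set."""
    n, p = X.shape
    pbar = a.mean()
    # AIC of the intercept-only model.
    best = 2 - 2 * n * (pbar * np.log(pbar) + (1 - pbar) * np.log(1 - pbar))
    selected, remaining = [], list(range(p))
    while remaining:
        cand, j = min((aic_logistic(X[:, selected + [k]], a), k)
                      for k in remaining)
        if cand >= best:
            break
        best = cand
        selected.append(j)
        remaining.remove(j)
    return selected
```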
<p>Modified from the simulation setting presented in Section 4.2, we considered 25 dependent covariates <italic>X</italic><sub>1</sub>, ..., <italic>X</italic><sub>25</sub> generated from a multivariate normal distribution with mean 0, variance 1, and the correlation <inline-formula><mml:math id="M92"><mml:mrow><mml:mi>C</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mi>r</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:msup><mml:mrow><mml:mn>5</mml:mn></mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mi>i</mml:mi><mml:mo>-</mml:mo><mml:mi>j</mml:mi><mml:mo>|</mml:mo></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>. The treatment <italic>A</italic><sub><italic>i</italic></sub> was generated from a Bernoulli distribution with probability &#x003C0;<sub><italic>i</italic></sub>, where</p>
<disp-formula id="E24"><mml:math id="M93"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>g</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003C0;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mn>1</mml:mn><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mn>2</mml:mn><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>The potential outcomes were generated by</p>
<disp-formula id="E25"><mml:math id="M94"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mi>y</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mi>a</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mn>3</mml:mn><mml:mo>+</mml:mo><mml:mn>5</mml:mn><mml:mi>a</mml:mi><mml:mo>+</mml:mo><mml:msubsup><mml:mi>x</mml:mi><mml:mrow><mml:mn>1</mml:mn><mml:mi>i</mml:mi></mml:mrow><mml:mn>3</mml:mn></mml:msubsup><mml:mo>+</mml:mo><mml:mn>2</mml:mn><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mn>3</mml:mn><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E26"><mml:math id="M95"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>&#x0007C;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>~</mml:mo><mml:mi>N</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msubsup><mml:mi>y</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>The true treatment effect is 5. We considered three different sample sizes: <italic>N</italic> = 100, 200, and 400. For each setting, 100 replicates were performed and the results were summarized. For comparison, we applied the Mahalanobis distance matching method using all of <italic>X</italic><sub>1</sub>&#x02212;<italic>X</italic><sub>25</sub> and using only the true covariate set (<italic>X</italic><sub>1</sub>, <italic>X</italic><sub>2</sub>). The MD matching considered calipers varying from 0.125 to 1 in steps of 0.025. As in Section 4.2, the Match function from the R package Matching was used.</p>
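This high-dimensional design can be sketched as follows (a minimal Python sketch of the data-generating process described above):

```python
import numpy as np

def simulate_highdim(n, p=25, rng=None):
    """One replicate of the Section 4.3 design: true ATE = 5."""
    rng = rng or np.random.default_rng(0)
    # Corr(X_i, X_j) = 0.5^|i-j| (AR(1)-type correlation), unit variances.
    idx = np.arange(p)
    Sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])
    X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
    # logit(pi_i) = -x1 - x2; only the first two covariates drive treatment.
    a = rng.binomial(1, 1.0 / (1.0 + np.exp(X[:, 0] + X[:, 1])))
    # y^(a) = 3 + 5a + x1^3 + 2*x3, observed with N(0, 1) noise.
    y = 3.0 + 5.0 * a + X[:, 0] ** 3 + 2.0 * X[:, 2] + rng.normal(0, 1, n)
    return X, a, y
```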
<p>The comparisons of MAE and RMSE of these methods are shown in <xref ref-type="fig" rid="F4">Figure 4</xref>. Without variable selection, both the MD matching and GPMatch presented large biases for the sample size of 100. The performance quickly improves as the sample size increases for GPMatch, but not for the MD matching. The variable selection procedure clearly enhanced the performance of GPMatch, with results indistinguishable from those using the true PS when <italic>N</italic> = 400. GPMatch results are identical between the model using the true covariate set and the model using the true PS.</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Comparison of MAE and RMSE of Mahalanobis and GPMatch methods.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-09-1122114-g0004.tif"/>
</fig>
</sec>
<sec>
<title>4.4. Performance under dual misspecification</title>
<p>Following the well-known simulation design suggested by Kang and Schafer [<xref ref-type="bibr" rid="B27">27</xref>], covariates <italic>z</italic><sub>1</sub>, <italic>z</italic><sub>2</sub>, <italic>z</italic><sub>3</sub>, <italic>z</italic><sub>4</sub> were independently generated from the standard normal distribution <italic>N</italic>(0, 1). Treatment was assigned by <italic>A</italic><sub><italic>i</italic></sub>&#x0007E;<italic>Ber</italic>(&#x003C0;<sub><italic>i</italic></sub>), where</p>
<disp-formula id="E27"><mml:math id="M96"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>g</mml:mi><mml:mi>i</mml:mi><mml:mi>t</mml:mi><mml:msub><mml:mi>&#x003C0;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>z</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mn>0.5</mml:mn><mml:msub><mml:mi>z</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>&#x02212;</mml:mo><mml:mn>0.25</mml:mn><mml:msub><mml:mi>z</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mn>3</mml:mn></mml:mrow></mml:msub><mml:mo>&#x02212;</mml:mo><mml:mn>0.1</mml:mn><mml:msub><mml:mi>z</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mn>4</mml:mn></mml:mrow></mml:msub><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>The potential outcomes were generated for <italic>a</italic> &#x0003D; 0, 1 by</p>
<disp-formula id="E28"><mml:math id="M97"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mi>y</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mi>a</mml:mi><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mn>210</mml:mn><mml:mo>+</mml:mo><mml:mn>5</mml:mn><mml:mi>a</mml:mi><mml:mo>+</mml:mo><mml:mn>27.4</mml:mn><mml:msub><mml:mi>z</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mn>13.7</mml:mn><mml:msub><mml:mi>z</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mn>13.7</mml:mn><mml:msub><mml:mi>z</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mn>3</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mn>13.7</mml:mn><mml:msub><mml:mi>z</mml:mi><mml:mrow><mml:mi>i</mml:mi><mml:mn>4</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E29"><mml:math id="M98"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mi>Y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>&#x0007C;</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>~</mml:mo><mml:mi>N</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msup><mml:mi>y</mml:mi><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>A</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msup><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>The true treatment effect is 5. To assess the performances of the methods under the dual misspecifications, the transformed covariates <inline-formula><mml:math id="M99"><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>e</mml:mi><mml:mi>x</mml:mi><mml:mi>p</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>/</mml:mo><mml:mn>2</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>/</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>&#x0002B;</mml:mo><mml:mi>e</mml:mi><mml:mi>x</mml:mi><mml:mi>p</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mn>10</mml:mn><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mn>3</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mn>3</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>25</mml:mn></mml:mrow></mml:mfrac><mml:mo>&#x0002B;</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>6</mml:mn></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>3</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>, and <inline-formula><mml:math id="M100"><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mn>4</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>z</mml:mi></mml:mrow><mml:mrow><mml:mn>4</mml:mn></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:mn>20</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> were used in the model instead of <italic>z</italic><sub><italic>i</italic></sub>.</p>
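The dual-misspecification design can be sketched as follows (a minimal Python sketch following the formulas above; the analyst observes only the transformed x's, so both the PS model and the outcome model are misspecified):

```python
import numpy as np

def simulate_kang_schafer(n, rng):
    """One replicate of the Kang and Schafer design: true ATE = 5."""
    z = rng.normal(size=(n, 4))
    # logit(pi_i) = -z1 + 0.5*z2 - 0.25*z3 - 0.1*z4.
    logit = -z[:, 0] + 0.5 * z[:, 1] - 0.25 * z[:, 2] - 0.1 * z[:, 3]
    a = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))
    y = (210 + 5 * a + 27.4 * z[:, 0] + 13.7 * z[:, 1]
         + 13.7 * z[:, 2] + 13.7 * z[:, 3] + rng.normal(size=n))
    # Only the nonlinear transformations are observed by the analyst.
    x1 = np.exp(z[:, 0] / 2)
    x2 = z[:, 1] / (1 + np.exp(z[:, 0])) + 10
    x3 = (z[:, 0] * z[:, 2] / 25 + 0.6) ** 3
    x4 = (z[:, 1] + z[:, 3] + 20) ** 2
    return np.column_stack([x1, x2, x3, x4]), a, y
```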
<p>Two GPMatch models were considered: GPMatch1 included only the treatment effect in the mean function, while GPMatch2 additionally included all four covariates <italic>X</italic><sub>1</sub>&#x02212;<italic>X</italic><sub>4</sub>. Both included <italic>X</italic><sub>1</sub>&#x02212;<italic>X</italic><sub>4</sub> in the covariance function with four distinct length-scale parameters. The PS was estimated using two approaches: a logistic regression model on <italic>X</italic><sub>1</sub>&#x02212;<italic>X</italic><sub>4</sub>, and the covariate balancing propensity score method (CBPS, [<xref ref-type="bibr" rid="B40">40</xref>]) applied to <italic>X</italic><sub>1</sub>&#x02212;<italic>X</italic><sub>4</sub>; results corresponding to both versions of the PS are presented. Summaries over all replicates are presented in <xref ref-type="table" rid="T3">Table 3</xref>, and the RMSE and MAE for all methods considered are plotted in <xref ref-type="fig" rid="F5">Figure 5</xref>. For comparison, the Oracle, which uses the true outcome-generating model <italic>Y</italic>&#x0007E;<italic>Z</italic><sub>1</sub>&#x02212;<italic>Z</italic><sub>4</sub>, is also presented. Both GPMatch1 and GPMatch2 clearly outperform all the other causal inference methods in terms of bias, RMSE, MAE, and Rc, and their <italic>SE</italic><sub><italic>ave</italic></sub> closely matches <italic>SE</italic><sub><italic>emp</italic></sub>. For GPMatch, the ATE and the corresponding SE estimates improve quickly as the sample size increases. In contrast, the QNT_PS, AIPTW, LM_PS, and LM_sp(PS) methods show little improvement with increased sample size, nor does the simple LM. The gains of GPMatch over the existing methods are clearly evident: its RMSE and MAE are more than five times smaller than those of all the other methods except BART. Even compared with BART, GPMatch2 reduces the MAE by nearly a factor of two and GPMatch1 by about a factor of 1.5; similar results hold for the RMSE and average bias. The lower-than-nominal coverage rate is mainly driven by the remaining bias, which diminishes quickly as the sample size increases. Additional results are presented in <xref ref-type="supplementary-material" rid="SM1">Supplementary Figure S1</xref>.</p>
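<p>For reference, the dual misspecification design can be sketched as follows. Only the transformation for <italic>x</italic><sub>4</sub> (and part of <italic>x</italic><sub>3</sub>) appears explicitly above; the remaining covariate transformations and the treatment and outcome models follow the commonly cited Kang and Schafer (2007) specification and should be read as an assumption, not the exact code used in this study.</p>

```python
import numpy as np

def simulate_kang_schafer(n, seed=0):
    """Sketch of the Kang-Schafer dual misspecification design (assumed spec)."""
    rng = np.random.default_rng(seed)
    z1, z2, z3, z4 = rng.standard_normal((4, n))
    # Observed, nonlinearly transformed covariates used in place of z_i;
    # x4 matches the expression given in the text.
    x1 = np.exp(z1 / 2.0)
    x2 = z2 / (1.0 + np.exp(z1)) + 10.0
    x3 = (z1 * z3 / 25.0 + 0.6) ** 3
    x4 = (z2 + z4 + 20.0) ** 2
    # True propensity and outcome models are linear in the latent z_i.
    ps = 1.0 / (1.0 + np.exp(z1 - 0.5 * z2 + 0.25 * z3 + 0.1 * z4))
    trt = rng.binomial(1, ps)
    y = 210.0 + 27.4 * z1 + 13.7 * (z2 + z3 + z4) + rng.standard_normal(n)
    X = np.column_stack([x1, x2, x3, x4])
    Z = np.column_stack([z1, z2, z3, z4])
    return X, trt, y, Z
```

A model fit to <italic>X</italic><sub>1</sub>&#x02212;<italic>X</italic><sub>4</sub> is then misspecified for both the treatment assignment and the outcome, which is what makes this a stress test for all the methods in Table 3.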
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>Results of ATE estimates using different methods under the Kang and Schafer dual misspecification setting.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<th valign="top" align="left"><bold>Method</bold></th>
<th valign="top" align="center"><bold>Sample size</bold></th>
<th valign="top" align="center"><bold>RMSE</bold></th>
<th valign="top" align="center"><bold>MAE</bold></th>
<th valign="top" align="center"><bold>Bias</bold></th>
<th valign="top" align="center"><bold>Rc</bold></th>
<th valign="top" align="center"><bold><italic>SE</italic><sub><italic>avg</italic></sub></bold></th>
<th valign="top" align="center"><bold><italic>SE</italic><sub><italic>emp</italic></sub></bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" rowspan="3">Oracle</td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">0.224</td>
<td valign="top" align="center">0.150</td>
<td valign="top" align="center">0.011</td>
<td valign="top" align="center">0.95</td>
<td valign="top" align="center">0.225</td>
<td valign="top" align="center">0.225</td>
</tr>
 <tr>

<td valign="top" align="center">200</td>
<td valign="top" align="center">0.171</td>
<td valign="top" align="center">0.125</td>
<td valign="top" align="center">&#x02013;0.015</td>
<td valign="top" align="center">0.94</td>
<td valign="top" align="center">0.163</td>
<td valign="top" align="center">0.171</td>
</tr>
 <tr>

<td valign="top" align="center">400</td>
<td valign="top" align="center">0.102</td>
<td valign="top" align="center">0.063</td>
<td valign="top" align="center">&#x02013;0.015</td>
<td valign="top" align="center">0.96</td>
<td valign="top" align="center">0.112</td>
<td valign="top" align="center">0.102</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">GPMatch1</td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">2.400</td>
<td valign="top" align="center">1.606</td>
<td valign="top" align="center">&#x02013;1.254</td>
<td valign="top" align="center">0.92</td>
<td valign="top" align="center">2.158</td>
<td valign="top" align="center">2.057</td>
</tr>
 <tr>

<td valign="top" align="center">200</td>
<td valign="top" align="center">1.663</td>
<td valign="top" align="center">1.309</td>
<td valign="top" align="center">&#x02013;1.051</td>
<td valign="top" align="center">0.86</td>
<td valign="top" align="center">1.213</td>
<td valign="top" align="center">1.295</td>
</tr>
 <tr>

<td valign="top" align="center">400</td>
<td valign="top" align="center">0.897</td>
<td valign="top" align="center">0.587</td>
<td valign="top" align="center">&#x02013;0.564</td>
<td valign="top" align="center">0.86</td>
<td valign="top" align="center">0.673</td>
<td valign="top" align="center">0.701</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">GPMatch2</td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">1.977</td>
<td valign="top" align="center">1.358</td>
<td valign="top" align="center">&#x02013;0.940</td>
<td valign="top" align="center">0.91</td>
<td valign="top" align="center">1.672</td>
<td valign="top" align="center">1.748</td>
</tr>
 <tr>

<td valign="top" align="center">200</td>
<td valign="top" align="center">1.375</td>
<td valign="top" align="center">1.083</td>
<td valign="top" align="center">&#x02013;0.809</td>
<td valign="top" align="center">0.82</td>
<td valign="top" align="center">0.980</td>
<td valign="top" align="center">1.117</td>
</tr>
 <tr>

<td valign="top" align="center">400</td>
<td valign="top" align="center">0.761</td>
<td valign="top" align="center">0.484</td>
<td valign="top" align="center">&#x02013;0.432</td>
<td valign="top" align="center">0.87</td>
<td valign="top" align="center">0.567</td>
<td valign="top" align="center">0.629</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">QNT_PS<xref ref-type="table-fn" rid="TN1"><sup><italic>a</italic></sup></xref></td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">7.574</td>
<td valign="top" align="center">6.483</td>
<td valign="top" align="center">&#x02013;6.234</td>
<td valign="top" align="center">0.970</td>
<td valign="top" align="center">7.641</td>
<td valign="top" align="center">4.324</td>
</tr>
 <tr>

<td valign="top" align="center">200</td>
<td valign="top" align="center">7.408</td>
<td valign="top" align="center">6.559</td>
<td valign="top" align="center">&#x02013;6.615</td>
<td valign="top" align="center">0.860</td>
<td valign="top" align="center">5.199</td>
<td valign="top" align="center">3.353</td>
</tr>
 <tr>

<td valign="top" align="center">400</td>
<td valign="top" align="center">7.142</td>
<td valign="top" align="center">6.907</td>
<td valign="top" align="center">&#x02013;6.797</td>
<td valign="top" align="center">0.500</td>
<td valign="top" align="center">3.576</td>
<td valign="top" align="center">2.203</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">QNT_PS<xref ref-type="table-fn" rid="TN2"><sup><italic>b</italic></sup></xref></td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">8.589</td>
<td valign="top" align="center">7.360</td>
<td valign="top" align="center">&#x02013;7.177</td>
<td valign="top" align="center">0.970</td>
<td valign="top" align="center">7.541</td>
<td valign="top" align="center">4.744</td>
</tr>
 <tr>

<td valign="top" align="center">200</td>
<td valign="top" align="center">8.713</td>
<td valign="top" align="center">8.121</td>
<td valign="top" align="center">&#x02013;7.964</td>
<td valign="top" align="center">0.720</td>
<td valign="top" align="center">5.214</td>
<td valign="top" align="center">3.550</td>
</tr>
 <tr>

<td valign="top" align="center">400</td>
<td valign="top" align="center">8.909</td>
<td valign="top" align="center">7.980</td>
<td valign="top" align="center">&#x02013;8.399</td>
<td valign="top" align="center">0.300</td>
<td valign="top" align="center">3.607</td>
<td valign="top" align="center">2.987</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">LM</td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">6.442</td>
<td valign="top" align="center">5.183</td>
<td valign="top" align="center">&#x02013;5.556</td>
<td valign="top" align="center">0.65</td>
<td valign="top" align="center">3.571</td>
<td valign="top" align="center">3.277</td>
</tr>
 <tr>

<td valign="top" align="center">200</td>
<td valign="top" align="center">6.906</td>
<td valign="top" align="center">6.226</td>
<td valign="top" align="center">&#x02013;6.375</td>
<td valign="top" align="center">0.28</td>
<td valign="top" align="center">2.547</td>
<td valign="top" align="center">2.668</td>
</tr>
 <tr>

<td valign="top" align="center">400</td>
<td valign="top" align="center">7.005</td>
<td valign="top" align="center">6.649</td>
<td valign="top" align="center">&#x02013;6.702</td>
<td valign="top" align="center">0.04</td>
<td valign="top" align="center">1.796</td>
<td valign="top" align="center">2.048</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">AIPTW<xref ref-type="table-fn" rid="TN1"><sup><italic>a</italic></sup></xref></td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">5.927</td>
<td valign="top" align="center">4.402</td>
<td valign="top" align="center">&#x02013;4.330</td>
<td valign="top" align="center">0.72</td>
<td valign="top" align="center">3.736</td>
<td valign="top" align="center">4.067</td>
</tr>
 <tr>

<td valign="top" align="center">200</td>
<td valign="top" align="center">19.226</td>
<td valign="top" align="center">5.262</td>
<td valign="top" align="center">&#x02013;7.270</td>
<td valign="top" align="center">0.59</td>
<td valign="top" align="center">4.874</td>
<td valign="top" align="center">17.888</td>
</tr>
 <tr>

<td valign="top" align="center">400</td>
<td valign="top" align="center">29.405</td>
<td valign="top" align="center">5.603</td>
<td valign="top" align="center">&#x02013;9.676</td>
<td valign="top" align="center">0.36</td>
<td valign="top" align="center">6.115</td>
<td valign="top" align="center">27.908</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">AIPTW<xref ref-type="table-fn" rid="TN2"><sup><italic>b</italic></sup></xref></td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">5.410</td>
<td valign="top" align="center">4.243</td>
<td valign="top" align="center">&#x02013;3.659</td>
<td valign="top" align="center">0.77</td>
<td valign="top" align="center">3.780</td>
<td valign="top" align="center">4.005</td>
</tr>
 <tr>

<td valign="top" align="center">200</td>
<td valign="top" align="center">5.780</td>
<td valign="top" align="center">5.075</td>
<td valign="top" align="center">&#x02013;4.950</td>
<td valign="top" align="center">0.52</td>
<td valign="top" align="center">2.712</td>
<td valign="top" align="center">2.999</td>
</tr>
 <tr>

<td valign="top" align="center">400</td>
<td valign="top" align="center">6.204</td>
<td valign="top" align="center">5.482</td>
<td valign="top" align="center">&#x02013;5.652</td>
<td valign="top" align="center">0.24</td>
<td valign="top" align="center">2.105</td>
<td valign="top" align="center">2.569</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">LM_PS<xref ref-type="table-fn" rid="TN1"><sup><italic>a</italic></sup></xref></td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">5.103</td>
<td valign="top" align="center">3.832</td>
<td valign="top" align="center">&#x02013;4.091</td>
<td valign="top" align="center">0.74</td>
<td valign="top" align="center">3.420</td>
<td valign="top" align="center">3.066</td>
</tr>
 <tr>

<td valign="top" align="center">200</td>
<td valign="top" align="center">5.392</td>
<td valign="top" align="center">4.648</td>
<td valign="top" align="center">&#x02013;4.793</td>
<td valign="top" align="center">0.53</td>
<td valign="top" align="center">2.452</td>
<td valign="top" align="center">2.483</td>
</tr>
 <tr>

<td valign="top" align="center">400</td>
<td valign="top" align="center">5.091</td>
<td valign="top" align="center">5.128</td>
<td valign="top" align="center">&#x02013;4.787</td>
<td valign="top" align="center">0.19</td>
<td valign="top" align="center">1.706</td>
<td valign="top" align="center">1.741</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">LM_PS<xref ref-type="table-fn" rid="TN2"><sup><italic>b</italic></sup></xref></td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">5.451</td>
<td valign="top" align="center">4.156</td>
<td valign="top" align="center">&#x02013;4.528</td>
<td valign="top" align="center">0.72</td>
<td valign="top" align="center">3.427</td>
<td valign="top" align="center">3.051</td>
</tr>
 <tr>

<td valign="top" align="center">200</td>
<td valign="top" align="center">5.891</td>
<td valign="top" align="center">4.981</td>
<td valign="top" align="center">&#x02013;5.278</td>
<td valign="top" align="center">0.46</td>
<td valign="top" align="center">2.466</td>
<td valign="top" align="center">2.631</td>
</tr>
 <tr>

<td valign="top" align="center">400</td>
<td valign="top" align="center">5.585</td>
<td valign="top" align="center">5.452</td>
<td valign="top" align="center">&#x02013;5.272</td>
<td valign="top" align="center">0.13</td>
<td valign="top" align="center">1.726</td>
<td valign="top" align="center">1.852</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">LM_sp(PS)<xref ref-type="table-fn" rid="TN1"><sup><italic>a</italic></sup></xref></td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">4.809</td>
<td valign="top" align="center">3.161</td>
<td valign="top" align="center">&#x02013;3.598</td>
<td valign="top" align="center">0.79</td>
<td valign="top" align="center">3.165</td>
<td valign="top" align="center">3.207</td>
</tr>
 <tr>

<td valign="top" align="center">200</td>
<td valign="top" align="center">4.982</td>
<td valign="top" align="center">4.152</td>
<td valign="top" align="center">&#x02013;4.266</td>
<td valign="top" align="center">0.52</td>
<td valign="top" align="center">2.250</td>
<td valign="top" align="center">2.587</td>
</tr>
 <tr>

<td valign="top" align="center">400</td>
<td valign="top" align="center">4.470</td>
<td valign="top" align="center">4.038</td>
<td valign="top" align="center">&#x02013;4.127</td>
<td valign="top" align="center">0.23</td>
<td valign="top" align="center">1.559</td>
<td valign="top" align="center">1.727</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">LM_sp(PS)<xref ref-type="table-fn" rid="TN2"><sup><italic>b</italic></sup></xref></td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">4.984</td>
<td valign="top" align="center">3.619</td>
<td valign="top" align="center">&#x02013;3.806</td>
<td valign="top" align="center">0.77</td>
<td valign="top" align="center">3.095</td>
<td valign="top" align="center">3.233</td>
</tr>
 <tr>

<td valign="top" align="center">200</td>
<td valign="top" align="center">5.237</td>
<td valign="top" align="center">4.374</td>
<td valign="top" align="center">&#x02013;4.507</td>
<td valign="top" align="center">0.51</td>
<td valign="top" align="center">2.248</td>
<td valign="top" align="center">2.681</td>
</tr>
 <tr>

<td valign="top" align="center">400</td>
<td valign="top" align="center">4.856</td>
<td valign="top" align="center">4.484</td>
<td valign="top" align="center">&#x02013;4.494</td>
<td valign="top" align="center">0.18</td>
<td valign="top" align="center">1.585</td>
<td valign="top" align="center">1.851</td>
</tr> <tr>
<td valign="top" align="left" rowspan="3">BART</td>
<td valign="top" align="center">100</td>
<td valign="top" align="center">3.148</td>
<td valign="top" align="center">2.504</td>
<td valign="top" align="center">&#x02013;2.491</td>
<td valign="top" align="center">0.79</td>
<td valign="top" align="center">2.163</td>
<td valign="top" align="center">1.935</td>
</tr>
 <tr>

<td valign="top" align="center">200</td>
<td valign="top" align="center">2.176</td>
<td valign="top" align="center">1.870</td>
<td valign="top" align="center">&#x02013;1.726</td>
<td valign="top" align="center">0.74</td>
<td valign="top" align="center">1.308</td>
<td valign="top" align="center">1.332</td>
</tr>
 <tr>

<td valign="top" align="center">400</td>
<td valign="top" align="center">1.283</td>
<td valign="top" align="center">0.942</td>
<td valign="top" align="center">&#x02013;0.997</td>
<td valign="top" align="center">0.71</td>
<td valign="top" align="center">0.757</td>
<td valign="top" align="center">0.812</td>
</tr></tbody>
</table>
<table-wrap-foot>
<fn id="TN1"><label>a</label><p>Propensity score estimated using logistic regression on <italic>X</italic><sub>1</sub>&#x02212;<italic>X</italic><sub>4</sub>.</p></fn>
<fn id="TN2"><label>b</label><p>Propensity score estimated using CBPS on <italic>X</italic><sub>1</sub>&#x02212;<italic>X</italic><sub>4</sub>.</p></fn>
<p>RMSE, root mean square error; MAE, median absolute error; Bias, average of estimate minus true value; Rc, rate of coverage by the 95% interval estimate; <italic>SE</italic><sub><italic>avg</italic></sub>, average of the standard error estimates over all replicates; <italic>SE</italic><sub><italic>emp</italic></sub>, standard error of the ATE estimates over all replicates.</p>
<p>GPMatch1-2: Bayesian structural model with Gaussian process prior. GPMatch1 includes only the treatment effect, and GPMatch2 includes both the treatment effect and <italic>X</italic><sub>1</sub>&#x02212;<italic>X</italic><sub>4</sub> in the mean function; both include <italic>X</italic><sub>1</sub>&#x02212;<italic>X</italic><sub>4</sub> in the covariance function.</p>
<p>QNT_PS, propensity score sub-classification by quintiles; AIPTW, augmented inverse probability of treatment weighting; LM, linear regression modeling <italic>Y</italic>&#x0007E;<italic>X</italic><sub>1</sub>&#x02212;<italic>X</italic><sub>4</sub>; LM_PS, linear regression modeling with propensity score adjustment; LM_sp(PS), linear regression modeling with spline-fit propensity score adjustment; BART, Bayesian additive regression tree.</p>
</table-wrap-foot>
</table-wrap>
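<p>The summary metrics reported in Table 3 can be computed directly from the replicate-level ATE and SE estimates, following the definitions in the table footnote (MAE as a median absolute error, Rc as coverage of the 95% interval). This is a generic sketch; the function and variable names are ours.</p>

```python
import numpy as np

def summarize_replicates(ate_hat, se_hat, ate_true):
    """Table-3-style summaries from per-replicate ATE and SE estimates."""
    ate_hat = np.asarray(ate_hat, dtype=float)
    se_hat = np.asarray(se_hat, dtype=float)
    err = ate_hat - ate_true
    lo, hi = ate_hat - 1.96 * se_hat, ate_hat + 1.96 * se_hat
    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.median(np.abs(err))),      # median absolute error
        "Bias": float(np.mean(err)),
        "Rc": float(np.mean((lo <= ate_true) & (ate_true <= hi))),
        "SE_avg": float(np.mean(se_hat)),          # average reported SE
        "SE_emp": float(np.std(ate_hat, ddof=1)),  # empirical SE across replicates
    }
```

A well-calibrated method should show SE_avg close to SE_emp and Rc near the nominal 0.95, which is the pattern GPMatch exhibits in Table 3.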
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>The RMSE and MAE of ATE estimates using different methods under the Kang and Schafer simulation study setting. GPMatch1-2: Bayesian structural model with Gaussian process prior. GPMatch1 includes only the treatment effect, and GPMatch2 includes both the treatment effect and <italic>X</italic><sub>1</sub>&#x02212;<italic>X</italic><sub>4</sub> in the mean function; both include <italic>X</italic><sub>1</sub>&#x02212;<italic>X</italic><sub>4</sub> in the covariance function. QNT_PS, propensity score sub-classification by quintiles; AIPTW, augmented inverse probability of treatment weighting; LM, linear regression modeling <italic>Y</italic>&#x0007E;<italic>X</italic><sub>1</sub>&#x02212;<italic>X</italic><sub>4</sub>; LM_PS, linear regression modeling with propensity score adjustment; LM_sp(PS), linear regression modeling with spline-fit propensity score adjustment; BART, Bayesian additive regression tree.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-09-1122114-g0005.tif"/>
</fig>
</sec>
</sec>
<sec id="s5">
<title>5. A case study</title>
<p>JIA is a chronic inflammatory disease, the most common autoimmune disease affecting the musculoskeletal system, and a major cause of childhood disability. The disease is relatively rare, with an estimated incidence rate of 12 per 100,000 child-years [<xref ref-type="bibr" rid="B41">41</xref>]. There are many treatment options; currently, the two common approaches are the non-biologic disease-modifying anti-rheumatic drugs (DMARDs) and the biologic DMARDs. Limited clinical evidence suggests that early aggressive use of biologic DMARDs may be more effective [<xref ref-type="bibr" rid="B42">42</xref>]. Utilizing data collected from a completed, prospectively followed inception cohort study [<xref ref-type="bibr" rid="B43">43</xref>], a retrospective chart review collected medication prescription records for study participants captured in the electronic health record system. This comparative study aimed to determine whether therapy using an early aggressive combination of non-biologic and biologic DMARDs is more effective than the more commonly adopted non-biologic DMARD monotherapy in treating children with recently (less than 6 months) diagnosed polyarticular-course JIA. The study was approved by the investigator&#x00027;s institutional IRB.</p>
<p>The primary outcome is the Juvenile Arthritis Disease Activity Score (JADAS) after 6 months of treatment, a disease severity score calculated as the sum of four core clinical measures: physician&#x00027;s global assessment of disease activity (0&#x02013;10), patient&#x00027;s self-assessment of overall wellbeing (0&#x02013;10), erythrocyte sedimentation rate (ESR, standardized to 0&#x02013;10), and number of active joints (AJC, truncated to 0&#x02013;10). It ranges from 0 to 40, with 0 indicating no disease activity. Of the 75 patients receiving either non-biologic DMARDs or the early combination of biologic and non-biologic DMARDs at baseline, 52 were treated with non-biologic DMARDs and 23 with the early aggressive combination. Patients with longer disease duration, positive rheumatoid factor (RF), higher pain visual analog scale (VAS) scores, lower baseline functional ability as measured by the childhood health assessment questionnaire (CHAQ), greater lost range of motion (LROM), and higher JADAS scores were more likely to receive a biologic DMARD prescription. The propensity score was derived using the CBPS method applied to 11 pre-determined important baseline confounders. The derived PS achieved the desired covariate balance, within 0.2 absolute standardized mean difference (<xref ref-type="fig" rid="F6">Figure 6</xref>).</p>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>Balance check results for the case study.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-09-1122114-g0006.tif"/>
</fig>
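<p>The JADAS described above is a simple sum of four 0&#x02013;10 components and can be sketched as follows. The ESR standardization rule used here, (ESR &#x02212; 20)/10 truncated to [0, 10], is one common convention and is an assumption; the study may have used a different standardization.</p>

```python
def jadas10(physician_global, patient_global, esr, active_joint_count):
    """JADAS as the sum of four components, each contributing 0-10 (range 0-40).

    The ESR standardization, (ESR - 20) / 10 truncated to [0, 10], is a
    commonly used convention and is assumed here, not taken from the study.
    """
    esr_std = min(max((esr - 20.0) / 10.0, 0.0), 10.0)
    ajc = min(float(active_joint_count), 10.0)  # active joint count truncated at 10
    return physician_global + patient_global + esr_std + ajc
```

Because each component is bounded by 10, the total is bounded by 40, matching the stated 0&#x02013;40 range with 0 indicating no disease activity.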
<p>We considered two GPMatch modeling approaches: the first included the full list of covariates, and the second included only variables selected by step-wise logistic regression modeling of treatment assignment. Five variables were selected: baseline JADAS, CHAQ, time since diagnosis, positive rheumatoid factor test, and private insurance. These five covariates, along with the binary treatment indicator and the time of the 6-month follow-up since baseline, were used in the partially linear mean function of GPMatch. For comparison, the PS methods also considered the two corresponding sets of covariates in the outcome modeling where applicable. The results are presented in <xref ref-type="table" rid="T4">Table 4</xref>, with the left panel showing results using the selected covariates and the right panel showing results using the full set of covariates. Using the selected variables, GPMatch obtained an average treatment effect of &#x02013;2.98 with a standard error of 1.99 and a 95% credible interval of (&#x02013;6.91, 0.83). The point estimates of the two GPMatch models differ by less than 0.5 points. The results are similarly stable for the other PS-based methods, with BART showing more sensitivity to the choice of covariates. <xref ref-type="fig" rid="F7">Figure 7</xref> presents the trace plot and histogram of the posterior distribution of the ATE estimate. The results suggest that the early aggressive combination of non-biologic and biologic DMARDs as the first line of treatment is more effective, leading to a nearly 3-point reduction in JADAS 6 months after treatment compared with non-biologic DMARD treatment in children with newly diagnosed disease. The ATE estimates from GPMatch, the naive two-group comparison, and the other existing causal inference methods are presented in <xref ref-type="table" rid="T4">Table 4</xref>. The LM, LM_PS, LM_sp(PS), and AIPTW models include the same five covariates along with the treatment indicator, and BART used the treatment indicator and the same covariates. While all results suggested that early aggressive use of biologic DMARDs is effective, the naive comparison, PS sub-classification by quintiles, and AIPTW suggested a much smaller ATE. BART and the PS-adjusted linear regressions produced results closer to those of GPMatch, suggesting a 2- to 3-point reduction in the JADAS score under the early aggressive combination DMARD therapy. None of the results were statistically significant at the two-sided 0.05 level.</p>
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption><p>Results of case study ATE estimates with non-matching methods.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<th/>
<th valign="top" align="center" colspan="4"><bold>Selected covariates</bold></th>
<th valign="top" align="center" colspan="4"><bold>Full set of covariates</bold></th>
</tr>
</thead>
<tbody>
 <tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<td valign="top" align="left"><bold>Method</bold></td>
<td valign="top" align="center"><bold>Estimate</bold></td>
<td valign="top" align="center"><bold>SD</bold></td>
<td valign="top" align="center"><bold>LL</bold></td>
<td valign="top" align="center"><bold>UL</bold></td>
<td valign="top" align="center"><bold>Estimate</bold></td>
<td valign="top" align="center"><bold>SD</bold></td>
<td valign="top" align="center"><bold>LL</bold></td>
<td valign="top" align="center"><bold>UL</bold></td>
</tr> <tr>
<td valign="top" align="left">Na&#x000EF;ve</td>
<td valign="top" align="center">&#x02013;0.338</td>
<td valign="top" align="center">1.973</td>
<td valign="top" align="center">&#x02013;4.205</td>
<td valign="top" align="center">3.529</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center">-</td>
</tr> <tr>
<td valign="top" align="left">QNT_PS</td>
<td valign="top" align="center">&#x02013;0.265</td>
<td valign="top" align="center">0.792</td>
<td valign="top" align="center">&#x02013;1.817</td>
<td valign="top" align="center">1.286</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center">-</td>
</tr> <tr>
<td valign="top" align="left">AIPTW</td>
<td valign="top" align="center">&#x02013;0.589</td>
<td valign="top" align="center">2.809</td>
<td valign="top" align="center">&#x02013;6.094</td>
<td valign="top" align="center">4.916</td>
<td valign="top" align="center">&#x02013;0.324</td>
<td valign="top" align="center">2.959</td>
<td valign="top" align="center">&#x02013;6.124</td>
<td valign="top" align="center">5.476</td>
</tr> <tr>
<td valign="top" align="left">LM</td>
<td valign="top" align="center">&#x02013;2.761</td>
<td valign="top" align="center">2.044</td>
<td valign="top" align="center">&#x02013;6.767</td>
<td valign="top" align="center">1.245</td>
<td valign="top" align="center">&#x02013;3.127</td>
<td valign="top" align="center">2.010</td>
<td valign="top" align="center">&#x02013;7.067</td>
<td valign="top" align="center">0.812</td>
</tr> <tr>
<td valign="top" align="left">LM_PS</td>
<td valign="top" align="center">&#x02013;2.800</td>
<td valign="top" align="center">2.043</td>
<td valign="top" align="center">&#x02013;6.805</td>
<td valign="top" align="center">1.204</td>
<td valign="top" align="center">&#x02013;3.119</td>
<td valign="top" align="center">2.013</td>
<td valign="top" align="center">&#x02013;7.064</td>
<td valign="top" align="center">0.826</td>
</tr> <tr>
<td valign="top" align="left">LM_sp(PS)</td>
<td valign="top" align="center">&#x02013;1.930</td>
<td valign="top" align="center">2.261</td>
<td valign="top" align="center">&#x02013;6.362</td>
<td valign="top" align="center">2.501</td>
<td valign="top" align="center">&#x02013;2.072</td>
<td valign="top" align="center">2.234</td>
<td valign="top" align="center">&#x02013;6.450</td>
<td valign="top" align="center">2.305</td>
</tr> <tr>
<td valign="top" align="left">BART</td>
<td valign="top" align="center">&#x02013;1.838</td>
<td valign="top" align="center">1.618</td>
<td valign="top" align="center">&#x02013;4.903</td>
<td valign="top" align="center">1.434</td>
<td valign="top" align="center">&#x02013;0.942</td>
<td valign="top" align="center">1.406</td>
<td valign="top" align="center">&#x02013;3.845</td>
<td valign="top" align="center">1.636</td>
</tr> <tr>
<td valign="top" align="left">GPMatch</td>
<td valign="top" align="center">&#x02013;2.983</td>
<td valign="top" align="center">1.987</td>
<td valign="top" align="center">&#x02013;6.913</td>
<td valign="top" align="center">0.827</td>
<td valign="top" align="center">&#x02013;2.599</td>
<td valign="top" align="center">2.165</td>
<td valign="top" align="center">&#x02013;6.878</td>
<td valign="top" align="center">1.626</td>
</tr></tbody>
</table>
<table-wrap-foot>
<p>SD, standard deviation; LL, lower limit; UL, upper limit; Na&#x000EF;ve, Student&#x00027;s <italic>t</italic> two-group comparison; QNT_PS, propensity score sub-classification by quintiles; AIPTW, augmented inverse probability of treatment weighting; LM, linear regression modeling <italic>Y</italic>&#x0007E;<italic>X</italic>; LM_PS, linear regression modeling with propensity score adjustment; LM_sp(PS), linear regression modeling with spline-fit propensity score adjustment; BART, Bayesian additive regression tree; GPMatch, Bayesian structural model with Gaussian process prior.</p>
</table-wrap-foot>
</table-wrap>
<fig id="F7" position="float">
<label>Figure 7</label>
<caption><p>Case study trace plot and histogram.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-09-1122114-g0007.tif"/>
</fig>
<p>We also applied the covariate matching method to the same dataset using the same five baseline covariates. <xref ref-type="table" rid="T5">Table 5</xref> presents the results obtained with different calipers. As expected, as the caliper narrows, the number of discarded observations increases. Because only 10 patients were RF positive, matching on RF positivity was no longer possible once the caliper was &#x02264; 0.5. Similarly, because of the distributions of private insurance in the treated and untreated groups, matching on insurance was not possible for calipers of 1 or smaller; thus, for calipers &#x02264; 1, all subjects with private insurance were excluded. For calipers &#x02264; 0.5, all subjects with positive RF were excluded, and 50% of observations were discarded. When the caliper was set at 0.2, 67 of the 75 observations were discarded, leaving only 8 observations from which to estimate the ATE. The ATE estimate was sensitive to the choice of caliper, ranging from &#x02013;2.00 to &#x02013;4.28, making the study results difficult to interpret.</p>
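<p>The caliper matching exercise summarized in Table 5 can be sketched as greedy 1:1 nearest-neighbor matching without replacement, where a control qualifies only if every standardized covariate difference is within the caliper. This is a generic illustration of the technique together with the standardized mean difference used in the balance checks, not the exact implementation used in the study.</p>

```python
import numpy as np

def smd(x_treated, x_control):
    """Standardized mean difference with pooled standard deviation."""
    sd = np.sqrt((np.var(x_treated, ddof=1) + np.var(x_control, ddof=1)) / 2.0)
    return (np.mean(x_treated) - np.mean(x_control)) / sd

def caliper_match_ate(X, trt, y, caliper):
    """Greedy 1:1 matching without replacement, requiring every standardized
    covariate difference to be within the caliper (an illustrative convention)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    controls = list(np.flatnonzero(trt == 0))
    diffs = []
    for i in np.flatnonzero(trt == 1):
        d = np.abs(Xs[controls] - Xs[i])           # (n_controls, p) distances
        ok = np.flatnonzero(d.max(axis=1) <= caliper)
        if ok.size == 0:
            continue                               # treated unit is dropped
        j = ok[np.argmin(d[ok].sum(axis=1))]       # closest qualifying control
        diffs.append(y[i] - y[controls[j]])
        controls.pop(int(j))                       # matched without replacement
    ate = float(np.mean(diffs)) if diffs else float("nan")
    return ate, len(diffs)
```

Narrowing the caliper drops more unmatched units, which is exactly the pattern in Table 5: better post-match balance at the cost of fewer observations and a less stable ATE estimate.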
<table-wrap position="float" id="T5">
<label>Table 5</label>
<caption><p>Results of case study ATE estimates with the covariate matching method.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<th valign="top" align="left"><bold>Caliper</bold></th>
<th/>
<th valign="top" align="center"><bold>2</bold></th>
<th valign="top" align="center"><bold>1</bold></th>
<th valign="top" align="center"><bold>0.8</bold></th>
<th valign="top" align="center"><bold>0.5</bold></th>
<th valign="top" align="center"><bold>0.4</bold></th>
<th valign="top" align="center"><bold>0.2</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="2">ATE</td>
<td valign="top" align="center">&#x02013;2.165</td>
<td valign="top" align="center">&#x02013;2.582</td>
<td valign="top" align="center">&#x02013;2.763</td>
<td valign="top" align="center">&#x02013;3.826</td>
<td valign="top" align="center">&#x02013;4.280</td>
<td valign="top" align="center">&#x02013;2.000</td>
</tr> <tr>
<td valign="top" align="left" colspan="2">SE</td>
<td valign="top" align="center">1.784</td>
<td valign="top" align="center">1.420</td>
<td valign="top" align="center">1.377</td>
<td valign="top" align="center">1.067</td>
<td valign="top" align="center">0.623</td>
<td valign="top" align="center">0.307</td>
</tr> <tr>
<td valign="top" align="left" colspan="2">&#x00023; of obs dropped</td>
<td valign="top" align="center">10</td>
<td valign="top" align="center">24</td>
<td valign="top" align="center">29</td>
<td valign="top" align="center">48</td>
<td valign="top" align="center">55</td>
<td valign="top" align="center">67</td>
</tr> <tr style="background-color:&#x00023;919498;color:&#x00023;ffffff">
<td/>
<td valign="top" align="center"><bold>Before match</bold></td>
<td valign="top" align="center" colspan="6"><bold>After match</bold></td>
</tr> <tr style="background-color:#dee1e1">
<td valign="top" align="left" colspan="8"><bold>Standardized mean difference between two treatment groups</bold></td>
</tr> <tr>
<td valign="top" align="left">JADAS0</td>
<td valign="top" align="center">0.697</td>
<td valign="top" align="center">0.160</td>
<td valign="top" align="center">0.057</td>
<td valign="top" align="center">0.118</td>
<td valign="top" align="center">0.134</td>
<td valign="top" align="center">0.004</td>
<td valign="top" align="center">&#x02013;0.086</td>
</tr> <tr>
<td valign="top" align="left">Time diagnosed</td>
<td valign="top" align="center">0.080</td>
<td valign="top" align="center">&#x02013;0.101</td>
<td valign="top" align="center">&#x02013;0.121</td>
<td valign="top" align="center">&#x02013;0.117</td>
<td valign="top" align="center">&#x02013;0.036</td>
<td valign="top" align="center">&#x02013;0.031</td>
<td valign="top" align="center">0.094</td>
</tr> <tr>
<td valign="top" align="left">CHAQ</td>
<td valign="top" align="center">0.390</td>
<td valign="top" align="center">0.288</td>
<td valign="top" align="center">0.221</td>
<td valign="top" align="center">0.156</td>
<td valign="top" align="center">0.212</td>
<td valign="top" align="center">0.086</td>
<td valign="top" align="center">0.000</td>
</tr> <tr>
<td valign="top" align="left">RF positive</td>
<td valign="top" align="center">0.760</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">NA*</td>
<td valign="top" align="center">NA*</td>
<td valign="top" align="center">NA*</td>
</tr> <tr>
<td valign="top" align="left">Insurance</td>
<td valign="top" align="center">0.182</td>
<td valign="top" align="center">0.000</td>
<td valign="top" align="center">NA**</td>
<td valign="top" align="center">NA**</td>
<td valign="top" align="center">NA**</td>
<td valign="top" align="center">NA**</td>
<td valign="top" align="center">NA**</td>
</tr></tbody>
</table>
<table-wrap-foot>
<p>*When the caliper is &#x02264; 0.5, all of the observations with positive RF are excluded. **When the caliper &#x02264; 1, all of the observations with private insurance are excluded.</p>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="s6">
<title>6. Conclusions and discussions</title>
<p>Bayesian approaches to causal inference commonly treat it as a missing data problem. However, as Ding and Li [<xref ref-type="bibr" rid="B44">44</xref>] suggest, causal inference presents additional challenges that go beyond missing data alone. Approaches that do not carefully address these unique challenges are vulnerable to model misspecification and can produce seriously biased results. When treatment-by-indication confounding is not considered, naive Bayesian regression approaches can suffer from &#x0201C;regularization-induced bias&#x0201D; [<xref ref-type="bibr" rid="B11">11</xref>]. Because no more than one potential outcome can be observed for a given individual unit, the correlation of <inline-formula><mml:math id="M101"><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is not directly identifiable, leading to an &#x0201C;inferential quandary&#x0201D; [<xref ref-type="bibr" rid="B45">45</xref>]. Extensive simulations [<xref ref-type="bibr" rid="B23">23</xref>, <xref ref-type="bibr" rid="B27">27</xref>, <xref ref-type="bibr" rid="B46">46</xref>] have demonstrated poor operating characteristics for many widely adopted causal inference methods.</p>
<p>The proposed GPMatch method offers a full Bayesian causal inference approach that effectively addresses these unique challenges. First, by using the GP prior covariance function to model the covariance of the observed data, GPMatch estimates the missing potential outcomes much like a matching method, yet it avoids the pitfalls of many matching procedures: no data are discarded, and no arbitrary caliper is required. Instead, the model lets the data speak for themselves <italic>via</italic> the estimated length scale and variance parameters. The SE covariance function of the GP prior provides an alternative distance metric that closely resembles the Mahalanobis distance; it matches data points to a degree proportional to the SE distance, without requiring the specification of a caliper. For this reason, GPMatch can use the information in the data better than a matching procedure. Separate length scale parameters are assigned to the different covariates in the SE covariance function, which allows the data to select the most important covariates to match on and acknowledges that some variables are more important than others. While the idea of applying a GP prior to Bayesian causal inference is not new, utilizing the GP covariance function as a matching device is a unique contribution of this study. The matching utility of the GP covariance function is presented analytically. We presented a heuristic argument suggesting that GPMatch is asymptotically doubly robust: it estimates the treatment effect by inducing independence between two residuals, the residual from the treatment propensity estimate and the residual from the outcome estimate, much like the G-estimation method. Unlike two-staged G-estimation, however, GPMatch estimates the parameters of the covariance function and the mean function simultaneously. The GPMatch regression approach therefore integrates the benefits of regression modeling and matching and offers a natural way for Bayesian causal inference to address challenges unique to causal inference problems. The robust and efficient properties of GPMatch are well supported by simulation results designed to reflect the most realistic settings, i.e., settings in which no knowledge of the matching structure or of the functional form of the outcome model is available.</p>
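The matching behavior of the SE covariance function can be illustrated with a minimal sketch. This is not the paper's implementation; the data and parameter values below are hypothetical, and the code only shows how covariate-specific length scales turn the SE kernel into a soft, caliper-free distance metric.

```python
import numpy as np

def se_kernel(X1, X2, length_scales, sigma2):
    """Squared-exponential (SE) covariance with one length scale per
    covariate: k(x, x') = sigma2 * exp(-0.5 * sum_d ((x_d - x'_d)/l_d)^2).
    A small length scale on a covariate makes differences on that covariate
    dominate, so the kernel 'matches' units mainly on the covariates the
    data deem important -- no hard caliper is needed."""
    d = (X1[:, None, :] - X2[None, :, :]) / length_scales  # scaled pairwise differences
    return sigma2 * np.exp(-0.5 * np.sum(d ** 2, axis=-1))

# hypothetical data: three units measured on two covariates
X = np.array([[0.0, 0.0],
              [0.1, 2.0],
              [3.0, 0.1]])
K = se_kernel(X, X, length_scales=np.array([1.0, 1.0]), sigma2=1.0)
# K[i, j] decays smoothly with the scaled distance between units i and j,
# so nearby units borrow strength from each other in proportion to K
```

Shrinking the length scale of a covariate sharpens the match on that covariate, which is the soft analog of tightening a caliper in conventional matching.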
<p>The validity of causal inference by the GPMatch approach rests on three causal assumptions. In particular, we propose SUTVEA as a weaker causal assumption than SUTVA. SUTVEA posits that the potential outcomes and their difference are random variables; it can be considered a version of the stochastic consistency advocated by Cole and Frangakis [<xref ref-type="bibr" rid="B47">47</xref>] and VanderWeele [<xref ref-type="bibr" rid="B48">48</xref>]. SUTVEA is proposed to reflect the more realistic setting in which the outcome may be measured with error and the treatment actually received may vary across individuals, even when the treatment prescribed is identical. Although such treatment variations have been raised before [<xref ref-type="bibr" rid="B1">1</xref>], no approach to our knowledge has explicitly acknowledged them as such; rather, most methods treat treatment in the real world as having exactly the same meaning as treatment in randomized, strictly controlled experiments. By acknowledging the random error in outcome measures, the GPMatch method is better able to defend against potential model misspecification in challenging real-world settings. As with other approaches, the no-unmeasured-confounder assumption is also required, and because no individual has more than one potential outcome observed in the real world, these assumptions remain untestable. However, SUTVEA implies that the correlations among the potential outcomes have an inherent structure, which can be modeled when all confounders are observed. Potential outcomes from different individuals may therefore be correlated, and the correlation is null only conditional on the confounders. This new causal assumption allows a direct and explicit description of the underlying data-generating mechanism, which may help relieve the &#x0201C;inferential quandary.&#x0201D; By explicitly modeling the mean and covariance functions, GPMatch can be considered an extension of the widely adopted marginal structural mean model.</p>
<p>Heterogeneous treatment effects (HTEs) are ubiquitous. Here, we focused on presenting GPMatch for estimating the average treatment effect. We showed that GPMatch produces a shrinkage estimate of the ATE, in which the shrinkage factor is determined by the variance unaccounted for by the model and/or by unadjusted covariates. When the observed outcome is a perfect representation of a potential outcome, i.e., when <italic>Y</italic><sub><italic>i</italic></sub> &#x0003D; <italic>Y</italic><sub><italic>i</italic></sub>(1)<italic>A</italic><sub><italic>i</italic></sub>&#x0002B;<italic>Y</italic><sub><italic>i</italic></sub>(0)(1&#x02212;<italic>A</italic><sub><italic>i</italic></sub>), GPMatch estimates the ATE as a weighted average of HTEs, where the weight is determined by the propensity of treatment: strata with an equal chance of receiving either treatment receive the maximum possible weight, while strata with a very small or very large probability of receiving one of the treatments are given near-zero weight. This differs from the common definition of the ATE, which assigns equal weight to every HTE stratum; rather, it is closely related to the average treatment effect in the overlap population (ATO, [<xref ref-type="bibr" rid="B49">49</xref>]). As such, it avoids the lack-of-overlap issue that has plagued many flexible modeling approaches to causal inference. GPMatch can readily estimate the conditional average treatment effect (CATE) by including interactions between pre-specified treatment-modifying factors and the treatment. When the treatment effect modifiers are uncertain, Sivaganesan et al. [<xref ref-type="bibr" rid="B50">50</xref>] suggested a Bayesian decision-theoretic approach for identifying subgroup treatment effects in a randomized trial setting; with GPMatch, the same idea could be applied to identify subgroup treatment effects in real-world data. Future studies may evaluate the performance of GPMatch for estimating heterogeneous treatment effects.</p>
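The propensity-based weighting described above, maximal at equipoise and vanishing at the extremes, can be sketched as follows. This is an illustrative computation of overlap-style weights e(x)(1 &#x02212; e(x)) in the spirit of the ATO, not the paper's estimator; the propensity values are hypothetical.

```python
import numpy as np

def overlap_weights(propensity):
    """Overlap-style weight e(x) * (1 - e(x)): maximal at e(x) = 0.5,
    where either treatment is equally likely, and near zero when treatment
    assignment is nearly deterministic. This mirrors how the weighted
    average of HTEs described in the text down-weights strata that lack
    overlap between treatment groups."""
    e = np.asarray(propensity, dtype=float)
    return e * (1.0 - e)

# hypothetical propensity scores for three strata
e = np.array([0.05, 0.5, 0.95])
w = overlap_weights(e)
# the equipoise stratum (e = 0.5) dominates; near-deterministic strata
# contribute almost nothing to the weighted average treatment effect
```

In contrast, the common ATE gives every stratum equal weight regardless of overlap, which is why flexible models can extrapolate badly in regions where one treatment arm is essentially unobserved.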
<p>The full Bayesian modeling approach is particularly useful in comparative effectiveness research: it offers a coherent and flexible framework for incorporating prior knowledge and synthesizing information from different sources. As a full Bayesian causal inference model, GPMatch offers a flexible and general approach to the more complex data types and structures that arise naturally in many causal inference settings. It can be directly extended to multilevel or clustered data structures and to complex treatment types, such as multi-level, continuous, or composite treatments. The model could also be extended to time-varying treatment settings without much difficulty by following the g-formula framework, e.g., Huang et al. [<xref ref-type="bibr" rid="B38">38</xref>, <xref ref-type="bibr" rid="B51">51</xref>].</p>
<p>For simplicity of exposition, we have considered a relatively simple setting with a binary treatment and a Gaussian outcome. GPMatch can easily accommodate multi-level, continuous, and more general types of treatment, and GP regression has been extended to general outcome types, including binary and count data [<xref ref-type="bibr" rid="B52">52</xref>]. Future studies may further investigate its performance under general types of treatment, outcome, and data structure. Our simulations focused on comparisons with commonly used causal inference methods; future studies may compare our method with other advanced Bayesian methods, such as those proposed by Roy et al. [<xref ref-type="bibr" rid="B10">10</xref>] and Saarela et al. [<xref ref-type="bibr" rid="B18">18</xref>], as well as with advanced non-Bayesian approaches such as TMLE [<xref ref-type="bibr" rid="B53">53</xref>]. Finally, while our discussion has focused on estimating the sample average treatment effect (ATE), the approach is directly applicable to estimating the population average treatment effect and the average treatment effects in the treated and in the control.</p>
<p>GP regression is a very flexible modeling technique, but it is computationally expensive. The time cost of GP regression grows at an <italic>n</italic><sup>3</sup> rate, so it can be challenging with large <italic>N</italic> and/or large <italic>P</italic>. The Bayesian Gibbs sampling algorithm we used makes it even more demanding of computational resources. The literature offers solutions for applying GPs to large <italic>N</italic>, such as Banerjee et al. [<xref ref-type="bibr" rid="B54">54</xref>]; alternatively, one may consider Bayesian kernel regression as an approximation. Further studies are needed to improve computational efficiency and to consider variable selection. We presented two dimension-reduction strategies: (a) using the estimated propensity score, and (b) engaging a variable selection procedure. The simulation studies showed that variable selection strategies can be promising. Alternatively, one may specify priors for the length scale parameters, which are well known to be hard to estimate. Researchers have derived various priors for GPs, for example, the objective priors in Berger et al. [<xref ref-type="bibr" rid="B55">55</xref>], Kazianka and Pilz [<xref ref-type="bibr" rid="B56">56</xref>], and Ren et al. [<xref ref-type="bibr" rid="B57">57</xref>]. Gelfand et al. [<xref ref-type="bibr" rid="B58">58</xref>] suggested a uniform prior for the inverse of the scale parameter in spatial analysis, but we found a prior favoring a smooth surface more suitable for our purpose; researchers could also incorporate their own knowledge into the prior to obtain a more efficient estimate. Here we considered the squared exponential covariance function, but other covariance functions, such as the Mat&#x000E9;rn, could also be used. A simple block compound-symmetry covariance matrix with a single correlation coefficient parameter could serve as an alternative; such a blocked covariance setup could be useful for large sample sizes and for data with a reasonable clustering structure, as in a multi-site study. Future studies should explore this direction. Last, implementing GPMatch for causal inference may not be accessible to most practitioners. For this reason, we provide an easy-to-use, publicly available <ext-link ext-link-type="uri" xlink:href="https://pcats.research.cchmc.org/">online application</ext-link> that accepts user-supplied data. A complete step-by-step user&#x00027;s guide and further technical details of this and extended work can be found in our published technical report [<xref ref-type="bibr" rid="B51">51</xref>].</p>
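The <italic>n</italic><sup>3</sup> cost mentioned above comes from factorizing the <italic>n</italic> &#x000D7; <italic>n</italic> covariance matrix. A minimal sketch of standard GP regression via a Cholesky factorization, assuming a generic SE kernel with unit hyperparameters and synthetic data (not the paper's model or data), makes the bottleneck concrete:

```python
import numpy as np

def gp_posterior_mean(K, y, noise_var):
    """Posterior mean of a zero-mean GP regression at the training points.
    The Cholesky factorization of the n x n matrix (K + noise_var * I)
    costs O(n^3) time and O(n^2) memory -- the computational bottleneck
    discussed in the text; each Gibbs iteration repeats work of this order."""
    n = K.shape[0]
    L = np.linalg.cholesky(K + noise_var * np.eye(n))       # O(n^3) step
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))     # two O(n^2) triangular solves
    return K @ alpha

# synthetic example: 50 observations of a smooth function plus noise
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=50)
K = np.exp(-0.5 * (X - X.T) ** 2)  # SE kernel, unit length scale and variance
m = gp_posterior_mean(K, y, noise_var=0.01)
```

Doubling <italic>n</italic> roughly multiplies the factorization time by eight, which is why low-rank or blocked covariance approximations become attractive for large samples.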
</sec>
<sec sec-type="data-availability" id="s7">
<title>Data availability statement</title>
<p>The data analyzed in this study is subject to the following licenses/restrictions: Data are available on reasonable request. Requests to access these datasets should be directed to <email>bin.huang&#x00040;cchmc.org</email>.</p>
</sec>
<sec sec-type="ethics-statement" id="s8">
<title>Ethics statement</title>
<p>The study was approved by the Institutional Review Board at the Cincinnati Children&#x00027;s Hospital Medical Center (IRB &#x00023; 2015-2873). Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent from the participants was not required to participate in this study in accordance with the national legislation and the institutional requirements.</p>
</sec>
<sec sec-type="author-contributions" id="s9">
<title>Author contributions</title>
<p>Substantial contributions to the conception and design of the work, interpretation of data, and drafting the work or revising it critically for important intellectual content: BH, CC, and SS. Acquisition of data: BH and CC. Analysis of data: BH, CC, and JL. Agreement to be accountable for the content of the work: BH, CC, JL, and SS. All authors contributed to the article and approved the submitted version.</p>
</sec>
</body>
<back>
<sec sec-type="funding-information" id="s10">
<title>Funding</title>
<p>Research reported in this work was partially funded through a Patient-Centered Outcomes Research Institute (PCORI) Award (ME-1408-19894); a Process and Methods award from the Center for Clinical and Translational Science and Training (CCTST), supported by the National Center for Advancing Translational Sciences (NCATS) of the National Institutes of Health, Award Number 5UL1TR001425-03; the Innovation Fund (IF) from Cincinnati Children&#x00027;s Hospital Medical Center (CCHMC); and the National Institute of Arthritis and Musculoskeletal and Skin Diseases under Award Number P30AR076316.</p>

</sec>
<ack><p>We would like to thank the clinicians and parents who care for children with JIA for motivating us to take on the Patient Centered Adaptive Treatment Strategies (PCATS) project. We also acknowledge the members of the PCATS research team for their contributions to ensuring the quality of the data and its clinical meaningfulness: Michelle Adams, Timothy Beukelman, Hermine I. Brunner, Anne Kocsis, Melanie Kohlheim, Michal Kouril, Jeff Guo, Stephanie Gray, Dan Lovell, Esi M. Morgan, Alivia Neace, Tingting Qiu, Michael Seid, Stacey Woeste, Yin Zhang, Xiaomeng Yue, and Janet Zahner.</p>
</ack>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of interest</title>
<p>JL was employed by Regeneron Pharmaceutical. Additionally, a patent US20220093271A1 has been filed relating to the research presented. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>

</sec>
<sec sec-type="disclaimer" id="s11">
<title>Publisher&#x00027;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>

</sec>
<sec sec-type="disclaimer" id="s12">
<title>Author disclaimer</title>
<p>The views in this work are solely the responsibility of the authors and do not necessarily represent the views of the funders, their Board of Governors, or the Methodology Committee.</p>

</sec>
<sec sec-type="supplementary-material" id="s13">
<title>Supplementary material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fams.2023.1122114/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fams.2023.1122114/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Data_Sheet_1.PDF" id="SM1" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>

</sec>

<ref-list>
<title>References</title>
<ref id="B1">
<label>1.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rubin</surname> <given-names>DB</given-names></name></person-group>. <article-title>Bayesian inference for causal effects: the role of randomization</article-title>. <source>Ann Stat</source>. (<year>1978</year>) <volume>6</volume>:<fpage>34</fpage>&#x02013;<lpage>58</lpage>. <pub-id pub-id-type="doi">10.1214/aos/1176344064</pub-id></citation></ref>
<ref id="B2">
<label>2.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hirano</surname> <given-names>K</given-names></name> <name><surname>Imbens</surname> <given-names>GW</given-names></name> <name><surname>Rubin</surname> <given-names>DB</given-names></name> <name><surname>Zhou</surname> <given-names>XH</given-names></name></person-group>. <article-title>Assessing the effect of an influenza vaccine in an encouragement design</article-title>. <source>Biostatistics</source>. (<year>2000</year>) <volume>1</volume>:<fpage>69</fpage>&#x02013;<lpage>88</lpage>. <pub-id pub-id-type="doi">10.1093/biostatistics/1.1.69</pub-id><pub-id pub-id-type="pmid">12933526</pub-id></citation></ref>
<ref id="B3">
<label>3.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zajonc</surname> <given-names>T</given-names></name></person-group>. <article-title>Bayesian inference for dynamic treatment regimes: mobility, equity, and efficiency in student tracking</article-title>. <source>J Am Stat Assoc</source>. (<year>2012</year>) <volume>107</volume>:<fpage>80</fpage>&#x02013;<lpage>92</lpage>. <pub-id pub-id-type="doi">10.1080/01621459.2011.643747</pub-id></citation>
</ref>
<ref id="B4">
<label>4.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Imbens</surname> <given-names>GW</given-names></name> <name><surname>Rubin</surname> <given-names>DB</given-names></name></person-group>. <article-title>Bayesian inference for causal effects in randomized experiments with noncompliance</article-title>. <source>Ann Stat</source>. (<year>1997</year>) <volume>25</volume>:<fpage>305</fpage>&#x02013;<lpage>27</lpage>. <pub-id pub-id-type="doi">10.1214/aos/1034276631</pub-id></citation>
</ref>
<ref id="B5">
<label>5.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Baccini</surname> <given-names>M</given-names></name> <name><surname>Mattei</surname> <given-names>A</given-names></name> <name><surname>Mealli</surname> <given-names>F</given-names></name></person-group>. <article-title>Bayesian inference for causal mechanisms with application to a randomized study for postoperative pain control</article-title>. <source>Biostatistics</source>. (<year>2017</year>) <volume>18</volume>:<fpage>605</fpage>&#x02013;<lpage>17</lpage>. <pub-id pub-id-type="doi">10.1093/biostatistics/kxx010</pub-id><pub-id pub-id-type="pmid">28369188</pub-id></citation></ref>
<ref id="B6">
<label>6.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>L</given-names></name> <name><surname>Zhou</surname> <given-names>N</given-names></name> <name><surname>Zhu</surname> <given-names>L</given-names></name></person-group>. <article-title>Outcome regression-based estimation of conditional average treatment effect</article-title>. <source>Ann Inst Stat Math</source>. (<year>2022</year>) <volume>74</volume>:<fpage>987</fpage>&#x02013;<lpage>1041</lpage>. <pub-id pub-id-type="doi">10.1007/s10463-022-00821-x</pub-id></citation>
</ref>
<ref id="B7">
<label>7.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hill</surname> <given-names>JL</given-names></name></person-group>. <article-title>Bayesian nonparametric modeling for causal inference</article-title>. <source>J Comput Graph Stat</source>. (<year>2011</year>) <volume>20</volume>:<fpage>217</fpage>&#x02013;<lpage>40</lpage>. <pub-id pub-id-type="doi">10.1198/jcgs.2010.08162</pub-id><pub-id pub-id-type="pmid">33265485</pub-id></citation></ref>
<ref id="B8">
<label>8.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Roy</surname> <given-names>J</given-names></name> <name><surname>Lum</surname> <given-names>KJ</given-names></name> <name><surname>Daniels</surname> <given-names>MJ</given-names></name></person-group>. <article-title>A Bayesian nonparametric approach to marginal structural models for point treatments and a continuous or survival outcome</article-title>. <source>Biostatistics</source>. (<year>2016</year>) <volume>18</volume>:<fpage>32</fpage>&#x02013;<lpage>47</lpage>. <pub-id pub-id-type="doi">10.1093/biostatistics/kxw029</pub-id><pub-id pub-id-type="pmid">27345532</pub-id></citation></ref>
<ref id="B9">
<label>9.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>Y</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>P</given-names></name> <name><surname>Wahed</surname> <given-names>AS</given-names></name> <name><surname>Thall</surname> <given-names>PF</given-names></name></person-group>. <article-title>Bayesian nonparametric estimation for dynamic treatment regimes with sequential transition times</article-title>. <source>J Am Stat Assoc</source>. (<year>2016</year>) <volume>111</volume>:<fpage>921</fpage>&#x02013;<lpage>35</lpage>. <pub-id pub-id-type="doi">10.1080/01621459.2015.1086353</pub-id><pub-id pub-id-type="pmid">28018015</pub-id></citation></ref>
<ref id="B10">
<label>10.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Roy</surname> <given-names>J</given-names></name> <name><surname>Lum</surname> <given-names>KJ</given-names></name> <name><surname>Zeldow</surname> <given-names>B</given-names></name> <name><surname>Dworkin</surname> <given-names>JD</given-names></name> <name><surname>Re III</surname> <given-names>VL</given-names></name> <name><surname>Daniels</surname> <given-names>MJ</given-names></name></person-group>. <article-title>Bayesian nonparametric generative models for causal inference with missing at random covariates</article-title>. <source>Biometrics</source>. (<year>2017</year>) <volume>74</volume>:<fpage>1193</fpage>&#x02013;<lpage>202</lpage>. <pub-id pub-id-type="doi">10.1111/biom.12875</pub-id><pub-id pub-id-type="pmid">29579341</pub-id></citation></ref>
<ref id="B11">
<label>11.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hahn</surname> <given-names>PR</given-names></name> <name><surname>Carvalho</surname> <given-names>CM</given-names></name> <name><surname>Puelz</surname> <given-names>D</given-names></name> <name><surname>He</surname> <given-names>J</given-names></name></person-group>. <article-title>Regularization and confounding in linear regression for treatment effect estimation</article-title>. <source>Bayesian Anal</source>. (<year>2018</year>) <volume>13</volume>:<fpage>163</fpage>&#x02013;<lpage>82</lpage>. <pub-id pub-id-type="doi">10.1214/16-BA1044</pub-id></citation>
</ref>
<ref id="B12">
<label>12.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>McCandless</surname> <given-names>LC</given-names></name> <name><surname>Douglas</surname> <given-names>IJ</given-names></name> <name><surname>Evans</surname> <given-names>SJ</given-names></name> <name><surname>Smeeth</surname> <given-names>L</given-names></name></person-group>. <article-title>Cutting feedback in Bayesian regression adjustment for the propensity score</article-title>. <source>Int J Biostat</source>. (<year>2010</year>) <volume>6</volume>:<fpage>1205</fpage>. <pub-id pub-id-type="doi">10.2202/1557-4679.1205</pub-id><pub-id pub-id-type="pmid">21972431</pub-id></citation></ref>
<ref id="B13">
<label>13.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zigler</surname> <given-names>CM</given-names></name> <name><surname>Watts</surname> <given-names>K</given-names></name> <name><surname>Yeh</surname> <given-names>RW</given-names></name> <name><surname>Wang</surname> <given-names>Y</given-names></name> <name><surname>Coull</surname> <given-names>BA</given-names></name> <name><surname>Dominici</surname> <given-names>F</given-names></name></person-group>. <article-title>Model feedback in Bayesian propensity score estimation</article-title>. <source>Biometrics</source>. (<year>2013</year>) <volume>69</volume>:<fpage>263</fpage>&#x02013;<lpage>73</lpage>. <pub-id pub-id-type="doi">10.1111/j.1541-0420.2012.01830.x</pub-id><pub-id pub-id-type="pmid">23379793</pub-id></citation></ref>
<ref id="B14">
<label>14.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hill</surname> <given-names>J</given-names></name> <name><surname>Reiter</surname> <given-names>JP</given-names></name></person-group>. <article-title>Interval estimation for treatment effects using propensity score matching</article-title>. <source>Stat Med</source>. (<year>2006</year>) <volume>25</volume>:<fpage>2230</fpage>&#x02013;<lpage>56</lpage>. <pub-id pub-id-type="doi">10.1002/sim.2277</pub-id><pub-id pub-id-type="pmid">16220488</pub-id></citation></ref>
<ref id="B15">
<label>15.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ho</surname> <given-names>DE</given-names></name> <name><surname>Imai</surname> <given-names>K</given-names></name> <name><surname>King</surname> <given-names>G</given-names></name> <name><surname>Stuart</surname> <given-names>EA</given-names></name></person-group>. <article-title>Matching as nonparametric preprocessing for reducing model dependence in parametric causal inference</article-title>. <source>Polit Anal</source>. (<year>2007</year>) <volume>15</volume>:<fpage>199</fpage>&#x02013;<lpage>236</lpage>. <pub-id pub-id-type="doi">10.1093/pan/mpl013</pub-id></citation>
</ref>
<ref id="B16">
<label>16.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rubin</surname> <given-names>DB</given-names></name> <name><surname>Stuart</surname> <given-names>EA</given-names></name></person-group>. <article-title>Affinely invariant matching methods with discriminant mixtures of proportional ellipsoidally symmetric distributions</article-title>. <source>Ann Stat</source>. (<year>2006</year>) <volume>34</volume>:<fpage>1814</fpage>&#x02013;<lpage>26</lpage>. <pub-id pub-id-type="doi">10.1214/009053606000000407</pub-id></citation>
</ref>
<ref id="B17">
<label>17.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rubin</surname> <given-names>DB</given-names></name> <name><surname>Thomas</surname> <given-names>N</given-names></name></person-group>. <article-title>Matching using estimated propensity scores: relating theory to practice</article-title>. <source>Biometrics</source>. (<year>1996</year>) <volume>2</volume>:<fpage>249</fpage>. <pub-id pub-id-type="doi">10.2307/2533160</pub-id><pub-id pub-id-type="pmid">8934595</pub-id></citation></ref>
<ref id="B18">
<label>18.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Saarela</surname> <given-names>O</given-names></name> <name><surname>Belzile</surname> <given-names>LR</given-names></name> <name><surname>Stephens</surname> <given-names>DA</given-names></name></person-group>. <article-title>A Bayesian view of doubly robust causal inference</article-title>. <source>Biometrika</source>. (<year>2016</year>) <volume>3</volume>:<fpage>667</fpage>&#x02013;<lpage>81</lpage>. <pub-id pub-id-type="doi">10.1093/biomet/asw025</pub-id></citation>
</ref>
<ref id="B19">
<label>19.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hahn</surname> <given-names>PR</given-names></name> <name><surname>Murray</surname> <given-names>J</given-names></name> <name><surname>Carvalho</surname> <given-names>CM</given-names></name></person-group>. <article-title>Bayesian regression tree models for causal inference: regularization, confounding, and heterogeneous effects</article-title>. <source>Bayesian Anal</source>. (<year>2017</year>) <volume>15</volume>:<fpage>965</fpage>&#x02013;<lpage>1056</lpage>. <pub-id pub-id-type="doi">10.2139/ssrn.3048177</pub-id></citation>
</ref>
<ref id="B20">
<label>20.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stuart</surname> <given-names>EA</given-names></name></person-group>. <article-title>Matching methods for causal inference: a review and a look forward</article-title>. <source>Stat Sci</source>. (<year>2010</year>) <volume>25</volume>:<fpage>1</fpage>&#x02013;<lpage>21</lpage>. <pub-id pub-id-type="doi">10.1214/09-STS313</pub-id><pub-id pub-id-type="pmid">20871802</pub-id></citation></ref>
<ref id="B21">
<label>21.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>King</surname> <given-names>G</given-names></name> <name><surname>Nielsen</surname> <given-names>R</given-names></name></person-group>. <article-title>Why propensity scores should not be used for matching</article-title>. <source>Polit Anal</source>. (<year>2019</year>) <volume>27</volume>:<fpage>435</fpage>&#x02013;<lpage>54</lpage>. <pub-id pub-id-type="doi">10.1017/pan.2019.11</pub-id></citation>
</ref>
<ref id="B22">
<label>22.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rubin</surname> <given-names>DB</given-names></name></person-group>. <article-title>The use of matched sampling and regression adjustment to remove bias in observational studies</article-title>. <source>Biometrics</source>. (<year>1973</year>) <volume>29</volume>:<fpage>185</fpage>&#x02013;<lpage>203</lpage>. <pub-id pub-id-type="doi">10.2307/2529685</pub-id></citation></ref>
<ref id="B23">
<label>23.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gutman</surname> <given-names>R</given-names></name> <name><surname>Rubin</surname> <given-names>DB</given-names></name></person-group>. <article-title>Estimation of causal effects of binary treatments in unconfounded studies</article-title>. <source>Stat Methods Med Res</source>. (<year>2017</year>) <volume>26</volume>:<fpage>1199</fpage>&#x02013;<lpage>215</lpage>. <pub-id pub-id-type="doi">10.1177/0962280215570722</pub-id><pub-id pub-id-type="pmid">26013308</pub-id></citation></ref>
<ref id="B24">
<label>24.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Choi</surname> <given-names>T</given-names></name> <name><surname>Woo</surname> <given-names>Y</given-names></name></person-group>. <article-title>On asymptotic properties of Bayesian partially linear models</article-title>. <source>J Kor Stat Soc</source>. (<year>2013</year>) <volume>42</volume>:<fpage>529</fpage>&#x02013;<lpage>41</lpage>. <pub-id pub-id-type="doi">10.1016/j.jkss.2013.03.003</pub-id></citation></ref>
<ref id="B25">
<label>25.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Choi</surname> <given-names>T</given-names></name> <name><surname>Schervish</surname> <given-names>MJ</given-names></name></person-group>. <article-title>On posterior consistency in nonparametric regression problems</article-title>. <source>J Mult Anal</source>. (<year>2007</year>) <volume>98</volume>:<fpage>1969</fpage>&#x02013;<lpage>87</lpage>. <pub-id pub-id-type="doi">10.1016/j.jmva.2007.01.004</pub-id></citation>
</ref>
<ref id="B26">
<label>26.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rosenbaum</surname> <given-names>PR</given-names></name> <name><surname>Rubin</surname> <given-names>DB</given-names></name></person-group>. <article-title>The central role of the propensity score in observational studies for causal effects</article-title>. <source>Biometrika</source>. (<year>1983</year>) <volume>70</volume>:<fpage>41</fpage>&#x02013;<lpage>55</lpage>. <pub-id pub-id-type="doi">10.1093/biomet/70.1.41</pub-id></citation>
</ref>
<ref id="B27">
<label>27.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kang</surname> <given-names>JDY</given-names></name> <name><surname>Schafer</surname> <given-names>JL</given-names></name></person-group>. <article-title>Demystifying double robustness: a comparison of alternative strategies for estimating a population mean from incomplete data</article-title>. <source>Stat Sci</source>. (<year>2007</year>) <volume>22</volume>:<fpage>523</fpage>&#x02013;<lpage>39</lpage>. <pub-id pub-id-type="doi">10.1214/07-STS227</pub-id><pub-id pub-id-type="pmid">18516239</pub-id></citation></ref>
<ref id="B28">
<label>28.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rasmussen</surname> <given-names>CE</given-names></name> <name><surname>Williams</surname> <given-names>CKI</given-names></name></person-group>. <source>Gaussian Processes for Machine Learning</source>. <publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>MIT Press</publisher-name> (<year>2006</year>).</citation>
</ref>
<ref id="B29">
<label>29.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Scharfstein</surname> <given-names>DO</given-names></name> <name><surname>Rotnitzky</surname> <given-names>A</given-names></name> <name><surname>Robins</surname> <given-names>JM</given-names></name></person-group>. <article-title>Adjusting for nonignorable drop-out using semiparametric nonresponse models</article-title>. <source>J Am Stat Assoc</source>. (<year>1999</year>) <volume>94</volume>:<fpage>1096</fpage>&#x02013;<lpage>120</lpage>. <pub-id pub-id-type="doi">10.1080/01621459.1999.10473862</pub-id></citation>
</ref>
<ref id="B30">
<label>30.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bang</surname> <given-names>H</given-names></name> <name><surname>Robins</surname> <given-names>JM</given-names></name></person-group>. <article-title>Doubly robust estimation in missing data and causal inference models</article-title>. <source>Biometrics</source>. (<year>2005</year>) <volume>61</volume>:<fpage>962</fpage>&#x02013;<lpage>73</lpage>. <pub-id pub-id-type="doi">10.1111/j.1541-0420.2005.00377.x</pub-id><pub-id pub-id-type="pmid">16401269</pub-id></citation></ref>
<ref id="B31">
<label>31.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chernozhukov</surname> <given-names>V</given-names></name> <name><surname>Chetverikov</surname> <given-names>D</given-names></name> <name><surname>Demirer</surname> <given-names>M</given-names></name> <name><surname>Duflo</surname> <given-names>E</given-names></name> <name><surname>Hansen</surname> <given-names>C</given-names></name> <name><surname>Newey</surname> <given-names>W</given-names></name> <etal/></person-group>. <article-title>Double/debiased machine learning for treatment and structural parameters</article-title>. <source>Econ J</source>. (<year>2018</year>) <volume>21</volume>:<fpage>C1</fpage>&#x02013;<lpage>C68</lpage>. <pub-id pub-id-type="doi">10.1111/ectj.12097</pub-id></citation>
</ref>
<ref id="B32">
<label>32.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schennach</surname> <given-names>SM</given-names></name></person-group>. <article-title>Bayesian exponentially tilted empirical likelihood</article-title>. <source>Biometrika</source>. (<year>2005</year>) <volume>92</volume>:<fpage>31</fpage>&#x02013;<lpage>46</lpage>. <pub-id pub-id-type="doi">10.1093/biomet/92.1.31</pub-id></citation></ref>
<ref id="B33">
<label>33.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chib</surname> <given-names>S</given-names></name> <name><surname>Shin</surname> <given-names>M</given-names></name> <name><surname>Simoni</surname> <given-names>A</given-names></name></person-group>. <article-title>Bayesian estimation and comparison of conditional moment models</article-title>. <source>arXiv:2110.13531 [math.ST]</source> (<year>2021</year>). <pub-id pub-id-type="doi">10.1111/rssb.12484</pub-id></citation>
</ref>
<ref id="B34">
<label>34.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Florens</surname> <given-names>JP</given-names></name> <name><surname>Simoni</surname> <given-names>A</given-names></name></person-group>. <article-title>Gaussian processes and bayesian moment estimation</article-title>. <source>J Bus Econ Stat</source>. (<year>2021</year>) <volume>39</volume>:<fpage>482</fpage>&#x02013;<lpage>92</lpage>. <pub-id pub-id-type="doi">10.1080/07350015.2019.1668799</pub-id></citation>
</ref>
<ref id="B35">
<label>35.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Luo</surname> <given-names>Y</given-names></name> <name><surname>Graham</surname> <given-names>DJ</given-names></name> <name><surname>Mccoy</surname> <given-names>EJ</given-names></name></person-group>. <article-title>Semiparametric Bayesian doubly robust causal estimation</article-title>. <source>J Stat Plann Inference</source>. (<year>2023</year>) <volume>225</volume>:<fpage>171</fpage>&#x02013;<lpage>87</lpage>. <pub-id pub-id-type="doi">10.1016/j.jspi.2022.12.005</pub-id></citation>
</ref>
<ref id="B36">
<label>36.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Robins</surname> <given-names>JM</given-names></name> <name><surname>Hern&#x000E1;n</surname> <given-names>M&#x000C1;</given-names></name> <name><surname>Brumback</surname> <given-names>B</given-names></name></person-group>. <article-title>Marginal structural models and causal inference in epidemiology</article-title>. <source>Epidemiology</source>. (<year>2000</year>) <volume>11</volume>:<fpage>550</fpage>&#x02013;<lpage>60</lpage>. <pub-id pub-id-type="doi">10.1097/00001648-200009000-00011</pub-id><pub-id pub-id-type="pmid">10955408</pub-id></citation></ref>
<ref id="B37">
<label>37.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vansteelandt</surname> <given-names>S</given-names></name> <name><surname>Joffe</surname> <given-names>M</given-names></name></person-group>. <article-title>Structural nested models and g-estimation: the partially realized promise</article-title>. <source>Stat Sci</source>. (<year>2014</year>) <volume>29</volume>:<fpage>707</fpage>&#x02013;<lpage>31</lpage>. <pub-id pub-id-type="doi">10.1214/14-STS493</pub-id></citation>
</ref>
<ref id="B38">
<label>38.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>B</given-names></name> <name><surname>Qiu</surname> <given-names>T</given-names></name> <name><surname>Chen</surname> <given-names>C</given-names></name> <name><surname>Zhang</surname> <given-names>Y</given-names></name> <name><surname>Seid</surname> <given-names>M</given-names></name> <name><surname>Lovell</surname> <given-names>D</given-names></name> <etal/></person-group>. <article-title>Timing matters: real-world effectiveness of early combination of biologic and conventional synthetic disease-modifying antirheumatic drugs for treating newly diagnosed polyarticular course juvenile idiopathic arthritis</article-title>. <source>RMD Open.</source> (<year>2020</year>) <volume>6</volume>:<fpage>e001091</fpage>. <pub-id pub-id-type="doi">10.1136/rmdopen-2019-001091</pub-id><pub-id pub-id-type="pmid">32396520</pub-id></citation></ref>
<ref id="B39">
<label>39.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sekhon</surname> <given-names>JS</given-names></name></person-group>. <article-title>Multivariate and propensity score matching with balance optimization</article-title>. <source>J Stat Software.</source> (<year>2011</year>) <volume>42</volume>:<fpage>1</fpage>&#x02013;<lpage>52</lpage>. <pub-id pub-id-type="doi">10.18637/jss.v042.i07</pub-id></citation>
</ref>
<ref id="B40">
<label>40.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Imai</surname> <given-names>K</given-names></name> <name><surname>Ratkovic</surname> <given-names>M</given-names></name></person-group>. <article-title>Covariate balancing propensity score</article-title>. <source>J R Stat Soc B</source>. (<year>2014</year>) <volume>76</volume>:<fpage>243</fpage>&#x02013;<lpage>63</lpage>. <pub-id pub-id-type="doi">10.1111/rssb.12027</pub-id></citation>
</ref>
<ref id="B41">
<label>41.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Harrold</surname> <given-names>LR</given-names></name> <name><surname>Salman</surname> <given-names>C</given-names></name> <name><surname>Shoor</surname> <given-names>S</given-names></name> <name><surname>Curtis</surname> <given-names>JR</given-names></name> <name><surname>Asgari</surname> <given-names>MM</given-names></name> <name><surname>Gelfand</surname> <given-names>JM</given-names></name> <etal/></person-group>. <article-title>Incidence and prevalence of juvenile idiopathic arthritis among children in a managed care population, 1996-2009</article-title>. <source>J Rheumatol</source>. (<year>2013</year>) <volume>40</volume>:<fpage>1218</fpage>&#x02013;<lpage>25</lpage>. <pub-id pub-id-type="doi">10.3899/jrheum.120661</pub-id><pub-id pub-id-type="pmid">23588938</pub-id></citation></ref>
<ref id="B42">
<label>42.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wallace</surname> <given-names>CA</given-names></name> <name><surname>Ringold</surname> <given-names>S</given-names></name> <name><surname>Bohnsack</surname> <given-names>J</given-names></name> <name><surname>Spalding</surname> <given-names>SJ</given-names></name> <name><surname>Brunner</surname> <given-names>HI</given-names></name> <name><surname>Milojevic</surname> <given-names>D</given-names></name> <etal/></person-group>. <article-title>Extension study of participants from the trial of early aggressive therapy in juvenile idiopathic arthritis</article-title>. <source>J Rheumatol</source>. (<year>2014</year>) <volume>41</volume>:<fpage>2459</fpage>&#x02013;<lpage>65</lpage>. <pub-id pub-id-type="doi">10.3899/jrheum.140347</pub-id><pub-id pub-id-type="pmid">25179849</pub-id></citation></ref>
<ref id="B43">
<label>43.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Seid</surname> <given-names>M</given-names></name> <name><surname>Huang</surname> <given-names>B</given-names></name> <name><surname>Niehaus</surname> <given-names>S</given-names></name> <name><surname>Brunner</surname> <given-names>HI</given-names></name> <name><surname>Lovell</surname> <given-names>DJ</given-names></name></person-group>. <article-title>Determinants of health-related quality of life in children newly diagnosed with juvenile idiopathic arthritis</article-title>. <source>Arthritis Care Res</source>. (<year>2014</year>) <volume>66</volume>:<fpage>263</fpage>&#x02013;<lpage>9</lpage>. <pub-id pub-id-type="doi">10.1002/acr.22117</pub-id><pub-id pub-id-type="pmid">23983144</pub-id></citation></ref>
<ref id="B44">
<label>44.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ding</surname> <given-names>P</given-names></name> <name><surname>Li</surname> <given-names>F</given-names></name></person-group>. <article-title>Causal inference: a missing data perspective</article-title>. <source>Stat Sci</source>. (<year>2018</year>) <volume>33</volume>:<fpage>214</fpage>&#x02013;<lpage>37</lpage>. <pub-id pub-id-type="doi">10.1214/18-STS645</pub-id><pub-id pub-id-type="pmid">32628678</pub-id></citation></ref>
<ref id="B45">
<label>45.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dawid</surname> <given-names>AP</given-names></name></person-group>. <article-title>Causal inference without counterfactuals (with discussion)</article-title>. <source>J Am Stat Assoc</source>. (<year>2000</year>) <volume>95</volume>:<fpage>407</fpage>&#x02013;<lpage>24</lpage>. <pub-id pub-id-type="doi">10.1080/01621459.2000.10474210</pub-id></citation>
</ref>
<ref id="B46">
<label>46.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hahn</surname> <given-names>PR</given-names></name> <name><surname>Dorie</surname> <given-names>V</given-names></name> <name><surname>Murray</surname> <given-names>JS</given-names></name></person-group>. <article-title>Atlantic Causal Inference Conference (ACIC) Data Analysis Challenge 2017</article-title>. <source>arXiv [Preprint]</source>. (<year>2019</year>). arXiv: 1905.09515. <pub-id pub-id-type="doi">10.48550/arXiv.1905.09515</pub-id></citation>
</ref>
<ref id="B47">
<label>47.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cole</surname> <given-names>SR</given-names></name> <name><surname>Frangakis</surname> <given-names>CE</given-names></name></person-group>. <article-title>The consistency statement in causal inference: a definition or an assumption?</article-title> <source>Epidemiology</source>. (<year>2009</year>) <volume>20</volume>:<fpage>3</fpage>&#x02013;<lpage>5</lpage>. <pub-id pub-id-type="doi">10.1097/EDE.0b013e31818ef366</pub-id><pub-id pub-id-type="pmid">19234395</pub-id></citation></ref>
<ref id="B48">
<label>48.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>VanderWeele</surname> <given-names>TJ</given-names></name></person-group>. <article-title>Concerning the consistency assumption in causal inference</article-title>. <source>Epidemiology</source>. (<year>2009</year>) <volume>20</volume>:<fpage>880</fpage>&#x02013;<lpage>3</lpage>. <pub-id pub-id-type="doi">10.1097/EDE.0b013e3181bd5638</pub-id><pub-id pub-id-type="pmid">19829187</pub-id></citation></ref>
<ref id="B49">
<label>49.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>F</given-names></name> <name><surname>Morgan</surname> <given-names>KL</given-names></name> <name><surname>Zaslavsky</surname> <given-names>AM</given-names></name></person-group>. <article-title>Balancing covariates <italic>via</italic> propensity score weighting</article-title>. <source>J Am Stat Assoc</source>. (<year>2018</year>) <volume>113</volume>:<fpage>390</fpage>&#x02013;<lpage>400</lpage>. <pub-id pub-id-type="doi">10.1080/01621459.2016.1260466</pub-id></citation>
</ref>
<ref id="B50">
<label>50.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sivaganesan</surname> <given-names>S</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>P</given-names></name> <name><surname>Huang</surname> <given-names>B</given-names></name></person-group>. <article-title>Subgroup finding <italic>via</italic> Bayesian additive regression trees</article-title>. <source>Stat Med</source>. (<year>2017</year>) <volume>36</volume>:<fpage>2391</fpage>&#x02013;<lpage>403</lpage>. <pub-id pub-id-type="doi">10.1002/sim.7276</pub-id><pub-id pub-id-type="pmid">28276142</pub-id></citation></ref>
<ref id="B51">
<label>51.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>B</given-names></name> <name><surname>Morgan</surname> <given-names>EM</given-names></name> <name><surname>Chen</surname> <given-names>C</given-names></name> <name><surname>Qiu</surname> <given-names>T</given-names></name> <name><surname>Adams</surname> <given-names>M</given-names></name> <name><surname>Zhang</surname> <given-names>Y</given-names></name> <etal/></person-group>. <source>New Statistical Methods to Compare the Effectiveness of Adaptive Treatment Plans</source>. <publisher-loc>Cincinnati, OH</publisher-loc>: <publisher-name>Cincinnati Children&#x00027;s Hospital Medical Center</publisher-name> (<year>2020</year>).</citation>
</ref>
<ref id="B52">
<label>52.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rasmussen</surname> <given-names>CE</given-names></name></person-group>. <article-title>Gaussian processes in machine learning</article-title>. In: <source>Advanced Lectures on Machine Learning</source>. <publisher-loc>Berlin, Heidelberg</publisher-loc>: <publisher-name>Springer</publisher-name> (<year>2004</year>). p. <fpage>63</fpage>&#x02013;<lpage>71</lpage>.</citation>
</ref>
<ref id="B53">
<label>53.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Van Der Laan</surname> <given-names>MJ</given-names></name> <name><surname>Rubin</surname> <given-names>D</given-names></name></person-group>. <article-title>Targeted maximum likelihood learning</article-title>. <source>Int J Biostat</source>. (<year>2006</year>) <volume>2</volume>:<fpage>1043</fpage>. <pub-id pub-id-type="doi">10.2202/1557-4679.1043</pub-id></citation>
</ref>
<ref id="B54">
<label>54.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Banerjee</surname> <given-names>S</given-names></name> <name><surname>Gelfand</surname> <given-names>AE</given-names></name> <name><surname>Finley</surname> <given-names>AO</given-names></name> <name><surname>Sang</surname> <given-names>H</given-names></name></person-group>. <article-title>Gaussian predictive process models for large spatial data sets</article-title>. <source>J R Stat Soc B Stat Methodol</source>. (<year>2008</year>) <volume>70</volume>:<fpage>825</fpage>&#x02013;<lpage>48</lpage>. <pub-id pub-id-type="doi">10.1111/j.1467-9868.2008.00663.x</pub-id><pub-id pub-id-type="pmid">19750209</pub-id></citation></ref>
<ref id="B55">
<label>55.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Berger</surname> <given-names>JO</given-names></name> <name><surname>De Oliveira</surname> <given-names>V</given-names></name> <name><surname>Sanso</surname> <given-names>B</given-names></name></person-group>. <article-title>Objective Bayesian analysis of spatially correlated data</article-title>. <source>J Am Stat Assoc</source>. (<year>2001</year>) <volume>96</volume>:<fpage>1361</fpage>&#x02013;<lpage>74</lpage>. <pub-id pub-id-type="doi">10.1198/016214501753382282</pub-id></citation>
</ref>
<ref id="B56">
<label>56.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kazianka</surname> <given-names>H</given-names></name> <name><surname>Pilz</surname> <given-names>J</given-names></name></person-group>. <article-title>Objective Bayesian analysis of spatial data with uncertain nugget and range parameters</article-title>. <source>Can J Stat</source>. (<year>2012</year>) <volume>40</volume>:<fpage>304</fpage>&#x02013;<lpage>27</lpage>. <pub-id pub-id-type="doi">10.1002/cjs.11132</pub-id></citation>
</ref>
<ref id="B57">
<label>57.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ren</surname> <given-names>C</given-names></name> <name><surname>Sun</surname> <given-names>D</given-names></name> <name><surname>Sahu</surname> <given-names>SK</given-names></name></person-group>. <article-title>Objective Bayesian analysis of spatial models with separable correlation functions</article-title>. <source>Can J Stat</source>. (<year>2013</year>) <volume>41</volume>:<fpage>488</fpage>&#x02013;<lpage>507</lpage>. <pub-id pub-id-type="doi">10.1002/cjs.11186</pub-id></citation>
</ref>
<ref id="B58">
<label>58.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gelfand</surname> <given-names>AE</given-names></name> <name><surname>Kottas</surname> <given-names>A</given-names></name> <name><surname>MacEachern</surname> <given-names>SN</given-names></name></person-group>. <article-title>Bayesian nonparametric spatial modeling with Dirichlet process mixing</article-title>. <source>J Am Stat Assoc</source>. (<year>2005</year>) <volume>100</volume>:<fpage>1021</fpage>&#x02013;<lpage>35</lpage>. <pub-id pub-id-type="doi">10.1198/016214504000002078</pub-id></citation>
</ref>
</ref-list> 
</back>
</article> 