<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Artif. Intell.</journal-id>
<journal-title>Frontiers in Artificial Intelligence</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Artif. Intell.</abbrev-journal-title>
<issn pub-type="epub">2624-8212</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">589632</article-id>
<article-id pub-id-type="doi">10.3389/frai.2021.589632</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Artificial Intelligence</subject>
<subj-group>
<subject>Methods</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Preventing Failures by Dataset Shift Detection in Safety-Critical Graph Applications</article-title>
<alt-title alt-title-type="left-running-head">Song et&#x20;al.</alt-title>
<alt-title alt-title-type="right-running-head">Graph Dataset Shift Detection</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Song</surname>
<given-names>Hoseung</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1047578/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Thiagarajan</surname>
<given-names>Jayaraman J.</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1079772/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kailkhura</surname>
<given-names>Bhavya</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/845367/overview"/>
</contrib>
</contrib-group>
<aff id="aff1">
<label>
<sup>1</sup>
</label>Department of Statistics, University of California, <addr-line>Davis</addr-line>, <addr-line>CA</addr-line>, <country>United&#x20;States</country>
</aff>
<aff id="aff2">
<label>
<sup>2</sup>
</label>Lawrence Livermore National Laboratory, <addr-line>Livermore</addr-line>, <addr-line>CA</addr-line>, <country>United&#x20;States</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/148909/overview">Novi Quadrianto</ext-link>, University of Sussex, United&#x20;Kingdom</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/570108/overview">Bowei Chen</ext-link>, University of Glasgow, United&#x20;Kingdom</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/629168/overview">Chetan Tonde</ext-link>, Amazon, United&#x20;States</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Hoseung Song, <email>hosong@ucdavis.edu</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Machine Learning and&#x20;Artificial&#x20;Intelligence, a section of the journal Frontiers in Artificial Intelligence</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>18</day>
<month>05</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>4</volume>
<elocation-id>589632</elocation-id>
<history>
<date date-type="received">
<day>31</day>
<month>07</month>
<year>2020</year>
</date>
<date date-type="accepted">
<day>26</day>
<month>04</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2021 Song, Thiagarajan and Kailkhura.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Song, Thiagarajan and Kailkhura</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these&#x20;terms.</p>
</license>
</permissions>
<abstract>
<p>Dataset shift refers to the problem where the input data distribution may change over time (e.g., between training and test stages). Since this can be a critical bottleneck in safety-critical applications such as healthcare and drug discovery, dataset shift detection has become an important research issue in machine learning. Though several existing efforts have focused on image/video data, applications with graph-structured data have not received sufficient attention. Therefore, in this paper, we investigate the problem of detecting shifts in graph-structured data through the lens of statistical hypothesis testing. Specifically, we propose a practical approach based on two-sample testing for shift detection in large-scale graph-structured data. Our approach is flexible in that it is suitable for both undirected and directed graphs and eliminates the need for equal sample sizes. Using empirical studies, we demonstrate the effectiveness of the proposed test in detecting dataset shifts. We also corroborate these findings using real-world datasets, characterized by directed graphs and a large number of&#x20;nodes.</p>
</abstract>
<kwd-group>
<kwd>graph learning</kwd>
<kwd>dataset shift</kwd>
<kwd>safety</kwd>
<kwd>two-sample testing</kwd>
<kwd>random graph models</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1 Introduction</title>
<p>Most machine learning (ML) applications, e.g., in healthcare and drug discovery, encounter dataset shift when operating in the real world. The shift arises because the testing conditions are biased relative to the training environment fixed by the experimental design. It is well known that ML systems are highly susceptible to such dataset shifts, which often lead to unintended and potentially harmful behavior. For example, in ML-based electronic health record systems, input data is often characterized by shifting demographics, clinical and operational practices evolve over time, and a wrong prediction can threaten human safety.</p>
<p>Although dataset shift is a frequent cause of failure of ML systems, very few ML systems inspect incoming data for a potential distribution shift (<xref ref-type="bibr" rid="B3">Bulusu et&#x20;al., 2020</xref>). While some practical methods, such as that of <xref ref-type="bibr" rid="B25">Rabanser et&#x20;al. (2019)</xref>, have been proposed for detecting shifts in applications with Euclidean structured data (speech, images, or video), there have been limited efforts to address this issue for graph-structured data, which naturally arises in several scientific and engineering applications. In recent years, there has been a surge of interest in applying ML techniques to structured data, e.g., graphs, trees, and manifolds. In particular, graph-structured data is becoming prevalent in several high-impact applications including bioinformatics, neuroscience, healthcare, molecular chemistry, and computer graphics. In this paper, we investigate the problem of detecting distribution shifts in graph-structured datasets for responsible deployment of ML in safety-critical applications. Specifically, we propose to solve the problem of detecting shifts in graph-structured data through the lens of statistical two-sample testing. Broadly, the objective in two-sample testing for graphs is to test whether two populations of random graphs are different or not based on the samples generated from each of&#x20;them.</p>
<p>Two-sample testing has been of significant research interest due to its broad applicability. An important class of testing methods relies on summary metrics that quantify the topological differences between networks. For example, in brain network analysis, commonly adopted topological summary metrics include the global efficiency (<xref ref-type="bibr" rid="B13">Ginestet et&#x20;al., 2011</xref>) and network modularity (<xref ref-type="bibr" rid="B11">Ginestet et&#x20;al., 2014</xref>). An inherent challenge with these approaches is that the topological characteristics depend directly on the number of edges in the graph, and can be insufficient in practice. An alternative class of methods is based on comparing the structure of subgraphs to produce a similarity score (<xref ref-type="bibr" rid="B27">Shervashidze et&#x20;al., 2009</xref>; <xref ref-type="bibr" rid="B19">Macindoe and Richards, 2010</xref>). For example, <xref ref-type="bibr" rid="B27">Shervashidze et&#x20;al. (2009)</xref> used the earth mover&#x2019;s distance between the distributions of feature summaries of their constituent subgraphs.</p>
<p>While these heuristic methods are reasonably effective for comparing real-world graphs, it was not until recently that a principled analysis of hypothesis testing with random graphs was carried out. In this spirit, <xref ref-type="bibr" rid="B12">Ginestet et&#x20;al. (2017)</xref> developed a test statistic based on a precise geometric characterization of the space of graph Laplacian matrices. Most of these approaches for graph testing based on classical two-sample tests are only applicable to the restrictive low-dimensional setting, where the population size (number of graphs) is larger than the size of the graphs (number of vertices). To overcome this challenge, <xref ref-type="bibr" rid="B28">Tang et&#x20;al. (2017a)</xref> proposed a semi-parametric two-sample test for a class of latent position random graphs, and studied the problem of testing whether two dot product random graphs are drawn from the same population or not. Other testing approaches that focus on specific scenarios, such as sparse networks (<xref ref-type="bibr" rid="B9">Ghoshdastidar et&#x20;al., 2017a</xref>) and networks with a large number of nodes (<xref ref-type="bibr" rid="B10">Ghoshdastidar et&#x20;al., 2017b</xref>), have also been developed. More recently, <xref ref-type="bibr" rid="B8">Ghoshdastidar and von Luxburg (2018)</xref> developed a novel testing framework for random graphs, particularly for cases with small sample sizes and a large number of nodes, and studied its optimality. More specifically, this test statistic was based on the asymptotic null distributions under certain model assumptions.</p>
<p>Unfortunately, all these approaches are limited to testing undirected graphs with equal sample sizes for the two graph populations. In real-world dataset shift detection problems, these assumptions are extremely restrictive, making existing approaches inapplicable to several applications. To circumvent these crucial shortcomings, we develop a novel approach based on hypothesis testing for detecting shifts in graph-structured data, which is more flexible: it accommodates 1) both undirected and directed graphs and 2) unequal sample sizes. Moreover, it remains highly effective as the sample size grows. Note that, similar to the setting in <xref ref-type="bibr" rid="B8">Ghoshdastidar and von Luxburg (2018)</xref>, we also consider&#x20;scenarios where all networks are defined on the same vertex set, which is common to several real-world applications. The main contributions of this paper are summarized below:<list list-type="simple">
<list-item>
<p>&#x2022; We propose a new test statistic that applies to undirected as well as directed graphs, and to unweighted as well as weighted graphs, while eliminating the equal sample size requirement. The asymptotic distribution of the proposed statistic, which is based on the well-known U-statistic, is derived.</p>
</list-item>
<list-item>
<p>&#x2022; A practical permutation approach based on a simplified form of the statistic is also proposed.</p>
</list-item>
<list-item>
<p>&#x2022; We compare the new approach with existing methods for graph testing in diverse simulation settings, and show that the proposed statistic is more flexible and achieves significant performance improvements.</p>
</list-item>
<list-item>
<p>&#x2022; In order to demonstrate the usefulness of the proposed method in challenging real-world problems, we consider several applications (including a healthcare application), and show the effectiveness of our approach.</p>
</list-item>
</list>
</p>
</sec>
<sec id="s2">
<title>2 Preliminaries</title>
<p>We consider the following two-sample setting. Let two samples of random graphs on <italic>d</italic> vertices be denoted by <inline-formula id="inf1">
<mml:math id="m1">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="script">A</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="script">A</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> from <inline-formula id="inf2">
<mml:math id="m2">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mn>0,1</mml:mn>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf3">
<mml:math id="m3">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mi>n</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> from <inline-formula id="inf4">
<mml:math id="m4">
<mml:mrow>
<mml:mi>Q</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mn>0,1</mml:mn>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> with their adjacency matrices <inline-formula id="inf5">
<mml:math id="m5">
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf6">
<mml:math id="m6">
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>n</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, respectively. We are concerned with testing hypotheses:<disp-formula id="e1">
<mml:math id="m7">
<mml:mrow>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
<mml:mo>:</mml:mo>
<mml:mi>P</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>Q</mml:mi>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mtext>&#xa0;vs</mml:mtext>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>:</mml:mo>
<mml:mi>P</mml:mi>
<mml:mo>&#x2260;</mml:mo>
<mml:mi>Q</mml:mi>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
<label>(1)</label>
</disp-formula>
</p>
<p>Note that we consider the case where each population consists of independent and identically distributed samples, which encompasses a wide range of network analysis problems; see, e.g., <xref ref-type="bibr" rid="B15">Holland et&#x20;al. (1983)</xref>, <xref ref-type="bibr" rid="B21">Newman and Girvan (2004)</xref>, and <xref ref-type="bibr" rid="B22">Newman (2006)</xref>. In contrast to existing formulations, e.g., <xref ref-type="bibr" rid="B8">Ghoshdastidar and von Luxburg (2018)</xref>, we consider a more flexible setup where 1) the sample sizes <italic>m</italic> and <italic>n</italic> are allowed to be different and 2) the graphs in <italic>P</italic> and <italic>Q</italic> can be weighted and/or directed.</p>
<p>While there have been several efforts on two-sample testing of graphs (<xref ref-type="bibr" rid="B2">Bubeck et&#x20;al., 2016</xref>; <xref ref-type="bibr" rid="B7">Gao and Lafferty, 2017</xref>; <xref ref-type="bibr" rid="B20">Maugis et&#x20;al., 2017</xref>), recent works such as <xref ref-type="bibr" rid="B28">Tang et&#x20;al. (2017a)</xref>, <xref ref-type="bibr" rid="B29">Tang et&#x20;al. (2017b)</xref>, and <xref ref-type="bibr" rid="B12">Ginestet et&#x20;al. (2017)</xref> have focused on designing more general testing methods that are applicable to practical settings. For example, <xref ref-type="bibr" rid="B12">Ginestet et&#x20;al. (2017)</xref> proposed a practical test statistic based on the correspondence between an undirected graph and its Laplacian under the inhomogeneous Erd&#x151;s-R&#xe9;nyi (IER) assumption, which means that all edges are generated independently from Bernoulli distributions (see details in <xref ref-type="sec" rid="s3">Section 3</xref>). The test statistic, under the assumption of equal sample sizes <italic>m</italic>, can be described as follows:<disp-formula id="e2">
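<p>The two-sample setting described above can be simulated with a minimal sketch. It assumes, purely for illustration, unweighted graphs whose edges are independent Bernoulli draws with probabilities given by the entries of a <italic>d</italic> &#xd7; <italic>d</italic> matrix; the function name <monospace>sample_graphs</monospace> and its parameters are hypothetical and are not taken from the original work. The sketch produces samples of unequal sizes <italic>m</italic> and <italic>n</italic>, one undirected and one directed.</p>
<preformat><![CDATA[
```python
import numpy as np


def sample_graphs(prob, size, directed=False, seed=None):
    """Draw `size` unweighted graphs on d vertices (illustrative helper).

    Assumption: each edge is an independent Bernoulli draw with the
    probability stored in the matching entry of `prob` (shape (d, d)).
    Returns an array of adjacency matrices with shape (size, d, d).
    """
    rng = np.random.default_rng(seed)
    d = prob.shape[0]
    # One uniform draw per graph and per ordered vertex pair.
    graphs = (rng.random((size, d, d)) < prob).astype(float)
    if directed:
        # Keep both directions; only remove self-loops.
        return graphs * (1.0 - np.eye(d))
    # Undirected: keep the strict upper triangle and mirror it.
    upper = np.triu(graphs, k=1)
    return upper + np.transpose(upper, (0, 2, 1))


# Unequal sample sizes: m = 4 undirected graphs, n = 7 directed graphs.
P = np.full((5, 5), 0.3)
Q = np.full((5, 5), 0.6)
A = sample_graphs(P, size=4, seed=0)
B = sample_graphs(Q, size=7, directed=True, seed=1)
```
]]></preformat>
<p>Stacking the adjacency matrices along the first axis keeps each sample in a single array, which makes entrywise means and sums over graphs one-line operations.</p>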
<mml:math id="m8">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>g</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>d</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>A</mml:mi>
<mml:mo>&#xaf;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>B</mml:mi>
<mml:mo>&#xaf;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mi>a</mml:mi>
</mml:mfrac>
</mml:mrow>
</mml:mstyle>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(2)</label>
</disp-formula>where<disp-formula id="equ1">
<mml:math id="m9">
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>A</mml:mi>
<mml:mo>&#xaf;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mstyle>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>B</mml:mi>
<mml:mo>&#xaf;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mstyle>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="equ2">
<mml:math id="m10">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>A</mml:mi>
<mml:mo>&#xaf;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>m</mml:mi>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>B</mml:mi>
<mml:mo>&#xaf;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>m</mml:mi>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
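<p>As a rough numerical illustration of <xref ref-type="disp-formula" rid="e2">Eq. 2</xref>, the sketch below evaluates the statistic for two samples of equal size <italic>m</italic>. It reads the denominator <italic>a</italic> as a per-pair quantity, since its definition depends on the indices <italic>i</italic> and <italic>j</italic>. The function name <monospace>t_gin</monospace> and the choice to skip vertex pairs whose estimated variance is zero are assumptions made here so that the code runs; they are not part of the original method.</p>
<preformat><![CDATA[
```python
import numpy as np


def t_gin(A, B):
    """Evaluate the statistic of Eq. 2 (illustrative sketch).

    A, B: arrays of shape (m, d, d) holding m symmetric adjacency
    matrices each; the two samples must have the same size m >= 2.
    """
    if A.shape != B.shape:
        raise ValueError("Eq. 2 assumes equal sample sizes and graph sizes")
    m, d, _ = A.shape
    if m < 2:
        raise ValueError("need at least two graphs per sample")
    A_bar = A.mean(axis=0)  # entrywise sample mean of the first sample
    B_bar = B.mean(axis=0)  # entrywise sample mean of the second sample
    # Per-pair denominator a_ij: pooled squared deviations over m(m - 1).
    a = (((A - A_bar) ** 2).sum(axis=0)
         + ((B - B_bar) ** 2).sum(axis=0)) / (m * (m - 1))
    rows, cols = np.triu_indices(d, k=1)  # vertex pairs with i < j
    num = (A_bar - B_bar)[rows, cols] ** 2
    den = a[rows, cols]
    # Assumption (not in the source): pairs with zero estimated
    # variance are skipped to avoid division by zero.
    valid = den > 0
    return float(np.sum(num[valid] / den[valid]))


# Tiny example: d = 2 vertices, m = 2 graphs per sample.
edge = np.array([[0.0, 1.0], [1.0, 0.0]])
empty = np.zeros((2, 2))
A = np.stack([edge, empty])
B = np.stack([edge, edge])
value = t_gin(A, B)
```
]]></preformat>
<p>In this toy example the single vertex pair has sample means 0.5 and 1, and the pooled denominator is 0.25, so the statistic equals 0.25/0.25 &#x3d; 1.</p>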
<p>The authors showed that <inline-formula id="inf7">
<mml:math id="m11">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>g</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> converges to a chi-square distribution as <inline-formula id="inf8">
<mml:math id="m12">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2192;</mml:mo>
<mml:mi>&#x221e;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> under <inline-formula id="inf9">
<mml:math id="m13">
<mml:mrow>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>. However, this statistic can be interpreted as Hotelling&#x2019;s <inline-formula id="inf10">
<mml:math id="m14">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> statistic for multivariate data, and thus comes with no performance guarantees in the &#x201c;small <italic>m</italic> and large <italic>d</italic>&#x201d; scenario. This is because the variance estimates used in <xref ref-type="disp-formula" rid="e2">Eq. 2</xref> are not stable for small <italic>m</italic> and large <italic>d</italic>, especially when the graphs are sparse.</p>
<p>Recently, <xref ref-type="bibr" rid="B8">Ghoshdastidar and von Luxburg (2018)</xref> proposed a new class of test statistics, designed for different scenarios under the IER model assumption. More specifically, they focused on cases with small <italic>m</italic> and large <italic>d</italic>. For cases with <inline-formula id="inf11">
<mml:math id="m15">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x3e;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>, the following test statistic was used:<disp-formula id="e3">
<mml:math id="m16">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mi>p</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:munder>
<mml:mrow>
<mml:mi>max</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2264;</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2264;</mml:mo>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>d</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(3)</label>
</disp-formula>
</p>
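<p>The statistic in <xref ref-type="disp-formula" rid="e3">Eq. 3</xref> can be sketched numerically as follows. The numerator is taken to be the spectral (operator) norm of the summed differences, and the denominator the square root of the largest row sum of the summed adjacency matrices. The function name <monospace>t_spec</monospace> and the convention of returning zero when all graphs are empty are assumptions made for this illustration only.</p>
<preformat><![CDATA[
```python
import numpy as np


def t_spec(A, B):
    """Evaluate the statistic of Eq. 3 (illustrative sketch).

    A, B: arrays of shape (m, d, d) with m adjacency matrices each.
    """
    if A.shape != B.shape:
        raise ValueError("Eq. 3 assumes equal sample sizes and graph sizes")
    diff = (A - B).sum(axis=0)           # sum over k of (A_k - B_k)
    # For a 2-D array, ord=2 gives the largest singular value.
    num = np.linalg.norm(diff, ord=2)
    total = (A + B).sum(axis=0)          # sum over k of (A_k + B_k)
    max_row_sum = total.sum(axis=1).max()  # max over i of the sum over j
    if max_row_sum == 0:
        # Assumption (not in the source): all graphs are empty.
        return 0.0
    return float(num / np.sqrt(max_row_sum))


# Tiny example: d = 2 vertices, m = 2 graphs per sample.
edge = np.array([[0.0, 1.0], [1.0, 0.0]])
A = np.stack([edge, edge])
B = np.zeros((2, 2, 2))
value = t_spec(A, B)
```
]]></preformat>
<p>Here the summed difference has spectral norm 2 and the largest row sum is 2, so the value is 2 divided by the square root of 2, up to floating-point error.</p>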
<p>Although the authors suggested performing this test using bootstrap samples from the aggregated data, this can be challenging for sparse graphs, since it is difficult to construct bootstrapped statistics from an operator norm. Hence, they considered an alternative test statistic based on the Frobenius norm:<disp-formula id="e4">
<mml:math id="m17">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>d</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x2264;</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext>&#x394;</mml:mtext>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x3e;</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext>&#x394;</mml:mtext>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mi>d</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x2264;</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x3e;</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:munder>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:mfrac>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(4)</label>
</disp-formula>where <inline-formula id="inf12">
<mml:math id="m18">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext>&#x394;</mml:mtext>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf13">
<mml:math id="m19">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>. It was shown that this test is provably effective and more reliable. Furthermore, they derived the asymptotic normality of <inline-formula id="inf14">
<mml:math id="m20">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> as <inline-formula id="inf15">
<mml:math id="m21">
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mo>&#x2192;</mml:mo>
<mml:mi>&#x221e;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>, which makes the method directly applicable without a bootstrap procedure. Despite these good properties, the test can be used only when the two sample sizes are equal and the graphs are undirected. In the rest of this paper, we develop a new test statistic that addresses these two crucial limitations.</p>
</sec>
<sec id="s3">
<title>3 Proposed Test</title>
<p>To carry out two-sample testing, we need to measure the distance between the two populations. Here, we use the squared Frobenius distance between the population adjacency matrices <italic>P</italic> and <italic>Q</italic> as evidence of discrepancy between the two populations:<disp-formula id="e5">
<mml:math id="m22">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mrow>
<mml:mo>&#x2016;</mml:mo>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>Q</mml:mi>
</mml:mrow>
<mml:mo>&#x2016;</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>F</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
<label>(5)</label>
</disp-formula>
</p>
<p>Next, we provide a finite-sample estimate of this quantity. To accommodate more general settings for random graphs, the new test statistic is defined as follows:<disp-formula id="e6">
<mml:math id="m23">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>d</mml:mi>
</mml:munderover>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>d</mml:mi>
</mml:munderover>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(6)</label>
</disp-formula>where<disp-formula id="equ3">
<mml:math id="m24">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:munder>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x2260;</mml:mo>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:munder>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:mover>
</mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:munder>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x2260;</mml:mo>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:munder>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:mover>
</mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mn>2</mml:mn>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:munder>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:munder>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:munder>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:munder>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:mover>
</mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
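<p>As an illustration, the statistic in Eq. 6 can be evaluated without looping over pairs of graphs, because the sum over <italic>k</italic> &#x2260; <italic>l</italic> of products of entries equals the square of the entrywise sum minus the sum of the entrywise squares. The following Python sketch is not the authors' implementation. It assumes that each sample is stored as a NumPy array of shape (<italic>m</italic>, <italic>d</italic>, <italic>d</italic>) or (<italic>n</italic>, <italic>d</italic>, <italic>d</italic>), and the function name <monospace>t_new</monospace> is a hypothetical choice made for this example.</p>
<preformat><![CDATA[
```python
import numpy as np


def t_new(A, B):
    """Evaluate T_new for two samples of graphs.

    A has shape (m, d, d) and B has shape (n, d, d); each slice is one
    adjacency matrix (possibly weighted and/or directed).
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    m, n = A.shape[0], B.shape[0]
    if m < 2 or n < 2:
        raise ValueError("each sample needs at least two graphs")

    sum_a = A.sum(axis=0)  # entrywise sum over the first sample
    sum_b = B.sum(axis=0)  # entrywise sum over the second sample

    # Sum over k != l of (A_k)_ij (A_l)_ij, via (sum)^2 - sum of squares.
    within_a = (sum_a ** 2 - (A ** 2).sum(axis=0)) / (m * (m - 1))
    within_b = (sum_b ** 2 - (B ** 2).sum(axis=0)) / (n * (n - 1))

    # Sum over all k, l of (A_k)_ij (B_l)_ij factorizes into a product.
    cross = 2.0 * sum_a * sum_b / (m * n)

    t_ij = within_a + within_b - cross  # d x d matrix of T_ij
    return float(t_ij.sum())
```
]]></preformat>
<p>In this sketch the two samples may have different sizes and the matrices need not be symmetric, since every entry (<italic>i</italic>, <italic>j</italic>) is handled separately.</p>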
<p>Note that the proposed test statistic accommodates scenarios where <inline-formula id="inf16">
<mml:math id="m25">
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> the sample sizes <italic>m</italic> and <italic>n</italic> are different and <inline-formula id="inf17">
<mml:math id="m26">
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> the graphs in <italic>P</italic> and <italic>Q</italic> are weighted and/or directed.</p>
<p>Next, we analyze the theoretical properties of the proposed test. For ease of theoretical analysis, we focus on the case where graphs are unweighted and undirected. However, the proposed test and algorithmic tools also apply to weighted and/or directed graphs, which are the main focus of the paper and are considered in our experimental evaluations. More specifically, in our theoretical analysis, we assume that graphs are drawn from the inhomogeneous Erd&#x151;s-R&#xe9;nyi (IER) random graph process, an extension of the Erd&#x151;s-R&#xe9;nyi (ER) model from <xref ref-type="bibr" rid="B1">Bollob&#xe1;s et&#x20;al. (2007)</xref>. In other words, we consider unweighted and undirected random graphs in which edges occur independently, without any additional structural assumption on the population adjacency matrix. Note that the IER model encompasses other models studied in the literature, including random dot product graphs (<xref ref-type="bibr" rid="B29">Tang et&#x20;al., 2017b</xref>) and stochastic block models (<xref ref-type="bibr" rid="B18">Lei et&#x20;al., 2016</xref>). A graph <inline-formula id="inf18">
<mml:math id="m27">
<mml:mrow>
<mml:mi mathvariant="script">G</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mn>0,1</mml:mn>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> with a symmetric population adjacency matrix <italic>P</italic> with zero diagonal is an IER graph if <inline-formula id="inf19">
<mml:math id="m28">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi mathvariant="script">G</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mo>&#x223c;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>.</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>.</mml:mo>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:mover>
</mml:mrow>
<mml:mi>B</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> for all <inline-formula id="inf20">
<mml:math id="m29">
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>. Here, <italic>d</italic> denotes the cardinality of the vertex set. Next, we analyze the theoretical properties of the proposed test under the IER assumption.</p>
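<p>To make the IER model concrete, the following Python sketch draws one unweighted, undirected graph by sampling an independent Bernoulli variable for each vertex pair <italic>i</italic> &#x3c; <italic>j</italic> and mirroring the result. It is only an illustration under the stated assumption: the function name <monospace>sample_ier</monospace>, the example matrix and the seed are hypothetical and do not come from the original experiments.</p>
<preformat><![CDATA[
```python
import numpy as np


def sample_ier(P, rng):
    """Draw one unweighted, undirected IER graph with edge probabilities P."""
    P = np.asarray(P, dtype=float)
    d = P.shape[0]
    # One independent Bernoulli(P_ij) draw per vertex pair i < j.
    upper = np.triu(rng.random((d, d)) < P, k=1).astype(int)
    # Mirror the upper triangle so the graph is symmetric with zero diagonal.
    return upper + upper.T


rng = np.random.default_rng(0)  # seed chosen only for reproducibility
P = np.array([[0.0, 0.9, 0.1],
              [0.9, 0.0, 0.5],
              [0.1, 0.5, 0.0]])  # hypothetical population adjacency matrix
G = sample_ier(P, rng)
```
]]></preformat>
<p>A stochastic block model is obtained as a special case by choosing <italic>P</italic> to be constant within blocks of vertices.</p>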
<p>LEMMA 3.1. <inline-formula id="inf21">
<mml:math id="m30">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is an unbiased empirical estimate of <italic>T</italic>, that is,<disp-formula id="e7">
<mml:math id="m31">
<mml:mrow>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>T</mml:mi>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
<label>(7)</label>
</disp-formula>
</p>
<p>PROOF. Under the IER assumptions, for all <inline-formula id="inf22">
<mml:math id="m32">
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>, we have<disp-formula id="equ4">
<mml:math id="m33">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x223c;</mml:mo>
<mml:mi>B</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="equ5">
<mml:math id="m34">
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x2260;</mml:mo>
<mml:mi>l</mml:mi>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mstyle>
<mml:mo>&#x223c;</mml:mo>
<mml:mi>B</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="equ6">
<mml:math id="m35">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x223c;</mml:mo>
<mml:mi>B</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>Q</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="equ7">
<mml:math id="m36">
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:munder>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x2260;</mml:mo>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:munder>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:mover>
</mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x223c;</mml:mo>
<mml:mi>B</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mi>Q</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>since <inline-formula id="inf23">
<mml:math id="m37">
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf24">
<mml:math id="m38">
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> are mutually independent <inline-formula id="inf25">
<mml:math id="m39">
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>,</mml:mo>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mi>l</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>n</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>. Then,<disp-formula id="equ8">
<mml:math id="m40">
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
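<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>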
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>d</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>d</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mi>m</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msubsup>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mi>n</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msubsup>
<mml:mi>Q</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mn>2</mml:mn>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mi>m</mml:mi>
<mml:mi>n</mml:mi>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>Q</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mo>&#x3d;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>d</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>d</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>Q</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:mo>&#x2016;</mml:mo>
<mml:mi>P</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>Q</mml:mi>
<mml:msubsup>
<mml:mo>&#x2016;</mml:mo>
<mml:mi>F</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>
</p>
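<p>The unbiasedness in Lemma 3.1 can also be checked numerically for a single vertex pair. The following Python sketch enumerates every binary outcome of the <italic>m</italic> + <italic>n</italic> edge indicators, weights each outcome by its probability, and compares the resulting exact expectation of <italic>T</italic><sub><italic>ij</italic></sub> with the squared difference of the two edge probabilities. The function names and the probabilities 0.3 and 0.7 are hypothetical choices made for this example only.</p>
<preformat><![CDATA[
```python
from itertools import product


def t_ij(a, b):
    """T_ij for one vertex pair; a and b hold the (i, j) entries of each graph."""
    m, n = len(a), len(b)
    within_a = sum(a[k] * a[l] for k in range(m) for l in range(m) if k != l)
    within_b = sum(b[k] * b[l] for k in range(n) for l in range(n) if k != l)
    cross = sum(a) * sum(b)
    return (within_a / (m * (m - 1))
            + within_b / (n * (n - 1))
            - 2.0 * cross / (m * n))


def expected_t_ij(p, q, m, n):
    """Exact E(T_ij) by enumerating all 2^(m + n) binary outcomes."""
    total = 0.0
    for bits in product((0, 1), repeat=m + n):
        a, b = bits[:m], bits[m:]
        prob = 1.0
        for x in a:
            prob *= p if x else 1.0 - p  # edge indicator from the first sample
        for x in b:
            prob *= q if x else 1.0 - q  # edge indicator from the second sample
        total += prob * t_ij(a, b)
    return total


p, q = 0.3, 0.7  # hypothetical edge probabilities P_ij and Q_ij
gap = expected_t_ij(p, q, m=2, n=3) - (p - q) ** 2  # zero up to rounding error
```
]]></preformat>
<p>Summing this identity over all vertex pairs gives the statement of the lemma. The enumeration is feasible only for small <italic>m</italic> and <italic>n</italic>, so it serves as a sanity check and not as a computational tool.</p>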
<p>In the form of <inline-formula id="inf26">
<mml:math id="m41">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, the first and second terms measure similarity (closeness) within each of the two samples, and the last term measures similarity between the two samples. Hence, a relatively large value of <inline-formula id="inf27">
<mml:math id="m42">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is evidence against the null hypothesis. Note that the proposed statistic requires neither equal sample sizes nor undirected graphs.</p>
<p>When <inline-formula id="inf28">
<mml:math id="m43">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>, we have a simpler form of the estimate. Let&#x20;<inline-formula id="inf29">
<mml:math id="m44">
<mml:mrow>
<mml:mi>Z</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> be <inline-formula id="inf30">
<mml:math id="m45">
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>.</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>.</mml:mo>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> random variables <inline-formula id="inf31">
<mml:math id="m46">
<mml:mrow>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x223c;</mml:mo>
<mml:mi>P</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>Q</mml:mi>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mo>&#xa0;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>. Then,<disp-formula id="e8">
<mml:math id="m47">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>d</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>d</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(8)</label>
</disp-formula>where<disp-formula id="e9">
<mml:math id="m48">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x2260;</mml:mo>
<mml:mi>l</mml:mi>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mstyle>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(9)</label>
</disp-formula>and <inline-formula id="inf32">
<mml:math id="m49">
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>z</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>. Since the proposed estimate has the form of a <italic>U</italic>-statistic, which provides a minimum-variance unbiased estimator for <italic>T</italic> (<xref ref-type="bibr" rid="B14">Hoeffding, 1992</xref>; <xref ref-type="bibr" rid="B26">Serfling, 2009</xref>), the asymptotic distribution of <inline-formula id="inf33">
<mml:math id="m50">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> can be derived from the asymptotic theory of <italic>U</italic>-statistics.</p>
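<p>For equal sample sizes, the kernel form in Eqs 8 and 9 can be sketched in Python as follows. This is an illustrative sketch and not the authors' code: it assumes both samples are NumPy arrays of shape (<italic>m</italic>, <italic>d</italic>, <italic>d</italic>), and the name <monospace>t_new_ustat</monospace> is hypothetical. The sum of the kernel over <italic>k</italic> &#x2260; <italic>l</italic> is expanded into three entrywise sums so that no explicit loop over graph pairs is needed.</p>
<preformat><![CDATA[
```python
import numpy as np


def t_new_ustat(A, B):
    """U-statistic form of T_new for equal sample sizes (m graphs each)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    m = A.shape[0]
    if B.shape[0] != m or m < 2:
        raise ValueError("both samples need the same size m >= 2")

    sum_a = A.sum(axis=0)
    sum_b = B.sum(axis=0)

    # Each line below is a sum over k != l, written as (all pairs) - (k == l).
    aa = sum_a ** 2 - (A ** 2).sum(axis=0)   # (A_k)_ij (A_l)_ij
    bb = sum_b ** 2 - (B ** 2).sum(axis=0)   # (B_k)_ij (B_l)_ij
    ab = sum_a * sum_b - (A * B).sum(axis=0)  # (A_k)_ij (B_l)_ij

    # The kernel h contributes aa + bb - ab - ab for every vertex pair.
    t_ij = (aa + bb - 2.0 * ab) / (m * (m - 1))
    return float(t_ij.sum())
```
]]></preformat>
<p>Because the kernel sum in Eq. 9 runs over <italic>k</italic> &#x2260; <italic>l</italic> only, the cross term in this sketch leaves out the paired products of the <italic>k</italic>th graph in the first sample with the <italic>k</italic>th graph in the second sample.</p>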
<p>THEOREM 3.1. Assume <inline-formula id="inf34">
<mml:math id="m51">
<mml:mrow>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mi>h</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mi>&#x221e;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>. Under <inline-formula id="inf35">
<mml:math id="m52">
<mml:mrow>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, we have<disp-formula id="e10">
<mml:math id="m53">
<mml:mrow>
<mml:msqrt>
<mml:mi>m</mml:mi>
</mml:msqrt>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>T</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mo>&#x2192;</mml:mo>
<mml:mi>d</mml:mi>
</mml:mover>
</mml:mrow>
<mml:mi>N</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>d</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:msup>
<mml:mi>&#x3c3;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(10)</label>
</disp-formula>where <inline-formula id="inf36">
<mml:math id="m54">
<mml:mrow>
<mml:msup>
<mml:mi>&#x3c3;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>var</mml:mi>
</mml:mrow>
<mml:mi>z</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mi>z</mml:mi>
<mml:mo>&#x27;</mml:mo>
</mml:mrow>
</mml:msub>
<mml:mi>h</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>z</mml:mi>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>z</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>. Under <inline-formula id="inf37">
<mml:math id="m55">
<mml:mrow>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, the <italic>U</italic>-statistic is degenerate and<disp-formula id="e11">
<mml:math id="m56">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mo>&#x2192;</mml:mo>
<mml:mi>d</mml:mi>
</mml:mover>
</mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>u</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>&#x221e;</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msup>
<mml:mi>d</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mi>u</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>&#x3be;</mml:mi>
<mml:mi>u</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(11)</label>
</disp-formula>where <inline-formula id="inf38">
<mml:math id="m57">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3be;</mml:mi>
<mml:mi>u</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mo>
</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>.</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>.</mml:mo>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:mover>
</mml:mrow>
<mml:mi>N</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>0,1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf39">
<mml:math id="m58">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mi>u</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> are the solutions of<disp-formula id="e12">
<mml:math id="m59">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mi>u</mml:mi>
</mml:msub>
<mml:msub>
<mml:mi>&#x3d5;</mml:mi>
<mml:mi>u</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>z</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:munder>
<mml:mo>&#x222b;</mml:mo>
<mml:msup>
<mml:mi>z</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
</mml:munder>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>z</mml:mi>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mi>z</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>&#x3d5;</mml:mi>
<mml:mi>u</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msup>
<mml:mi>z</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mi>d</mml:mi>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msup>
<mml:mi>z</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
<label>(12)</label>
</disp-formula>
</p>
<p>PROOF. These results can be obtained by applying the asymptotic properties of <italic>U</italic>-statistics as given in <xref ref-type="bibr" rid="B26">Serfling (2009)</xref> and the IER assumptions.</p>
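<p>The limiting null distribution in <xref ref-type="disp-formula" rid="e11">Eq. 11</xref> can be explored numerically by truncating the series and sampling from it. The following Python sketch is an illustration only: the function name <monospace>sample_null_limit</monospace>, the three eigenvalues, and the value of <italic>d</italic> are hypothetical choices and are not quantities estimated in this work. In practice the eigenvalues are unknown, which is why the sketch should not be read as a usable test procedure.</p>
<preformat><![CDATA[
```python
import numpy as np


def sample_null_limit(eigenvalues, d, num_draws, rng):
    """Draw from the truncated series d^2 * sum_u lambda_u * (xi_u^2 - 1)."""
    lam = np.asarray(eigenvalues, dtype=float)
    # xi_u are i.i.d. standard normal variables, one column per eigenvalue.
    xi = rng.standard_normal((num_draws, lam.size))
    return d ** 2 * ((xi ** 2 - 1.0) @ lam)


rng = np.random.default_rng(0)
hypothetical_lambdas = [0.5, 0.25, 0.125]  # illustrative values only
draws = sample_null_limit(hypothetical_lambdas, d=5, num_draws=200_000, rng=rng)

alpha = 0.05
# Monte Carlo estimate of the (1 - alpha) quantile of the truncated limit.
critical_value = np.quantile(draws, 1.0 - alpha)
print(f"estimated (1 - alpha) quantile: {critical_value:.2f}")
```
]]></preformat>
<p>Each term of the series has mean zero, so the simulated draws are centered at zero, and the distribution is right-skewed because every term is bounded below but not above.</p>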
<p>Having devised the test statistic, our next aim is to determine whether the new test statistic <inline-formula id="inf40">
<mml:math id="m60">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is large enough to be outside the <inline-formula id="inf41">
<mml:math id="m61">
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>&#x3b1;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> quantile of the limiting null distribution in <xref ref-type="disp-formula" rid="e11">Eq. 11</xref>, where <italic>&#x3b1;</italic> is the significance level of the test. One difficulty in implementing this test is that the asymptotic null distribution (<xref ref-type="disp-formula" rid="e11">Eq. 11</xref>) and its (1&#x2009;&#x2212;&#x2009;<italic>&#x3b1;</italic>) quantile do not have an analytic form unless <inline-formula id="inf42">
<mml:math id="m62">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bb;</mml:mi>
<mml:mi>u</mml:mi>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> or 1. Therefore, to estimate this quantile, we propose a permutation approach on the aggregated data. The main advantage of this method is that it yields a valid level-<italic>&#x3b1;</italic> test in finite-sample scenarios (<xref ref-type="bibr" rid="B17">Lehmann and Romano, 2006</xref>). To this end, we first consider a simpler form of the test statistic (based on <inline-formula id="inf43">
<mml:math id="m63">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>) defined as follows:<disp-formula id="e13">
<mml:math id="m64">
<mml:mrow>
<mml:msub>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>d</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>d</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msub>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:mstyle>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
<label>(13)</label>
</disp-formula>where<disp-formula id="e14">
<mml:math id="m65">
<mml:mrow>
<mml:msub>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x2260;</mml:mo>
<mml:mi>l</mml:mi>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>A</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mstyle>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x2260;</mml:mo>
<mml:mi>l</mml:mi>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>B</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mstyle>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
<label>(14)</label>
</disp-formula>
</p>
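<p>The simplified statistic in <xref ref-type="disp-formula" rid="e13">Eqs. 13</xref>,&#x20;<xref ref-type="disp-formula" rid="e14">14</xref> can be evaluated without an explicit double loop over pairs of graphs. The Python sketch below is a minimal illustration under our own assumptions: each sample is stored as an array of adjacency matrices of shape (<italic>m</italic>, <italic>d</italic>, <italic>d</italic>) or (<italic>n</italic>, <italic>d</italic>, <italic>d</italic>), and the function names <monospace>within_sample_sum</monospace> and <monospace>t_prime_new</monospace> are hypothetical.</p>
<preformat><![CDATA[
```python
import numpy as np


def within_sample_sum(samples):
    """Return sum_{i,j} sum_{k != l} (X_k)_ij * (X_l)_ij for a stack of matrices.

    Uses the entrywise identity
    sum_{k != l} x_k x_l = (sum_k x_k)^2 - sum_k x_k^2.
    """
    total = samples.sum(axis=0)
    return float((total ** 2 - (samples ** 2).sum(axis=0)).sum())


def t_prime_new(sample_a, sample_b):
    """Evaluate the simplified statistic of Eqs. 13, 14.

    sample_a has shape (m, d, d) and sample_b has shape (n, d, d).
    """
    sample_a = np.asarray(sample_a, dtype=float)
    sample_b = np.asarray(sample_b, dtype=float)
    m, n = sample_a.shape[0], sample_b.shape[0]
    if min(m, n) < 2:
        raise ValueError("each sample needs at least two graphs")
    return (within_sample_sum(sample_a) / (m * (m - 1))
            + within_sample_sum(sample_b) / (n * (n - 1)))


# Tiny hand-checkable example with d = 2 and m = n = 2.
edge = np.array([[0, 1], [1, 0]])
empty = np.zeros((2, 2))
# First sample: two identical graphs, contributes 4 / 2 = 2.
# Second sample: one graph has no edges, contributes 0.
value = t_prime_new([edge, edge], [edge, empty])
print(value)
```
]]></preformat>
<p>Because the identity is applied entrywise, the cost of one evaluation grows linearly in the number of graphs for a fixed <italic>d</italic>.</p>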
<p>Although we do not use the last term of <inline-formula id="inf44">
<mml:math id="m66">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> in the definition of <inline-formula id="inf45">
<mml:math id="m67">
<mml:mrow>
<mml:msub>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, the performance of the test statistic <inline-formula id="inf46">
<mml:math id="m68">
<mml:mrow>
<mml:msub>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> achieved by incorporating similarities in two samples is still maintained in the permutation framework. The permutation test is summarized in <xref ref-type="statement" rid="alg1">
<bold>Algorithm 1</bold>
</xref>; its computational cost is <inline-formula id="inf47">
<mml:math id="m69">
<mml:mrow>
<mml:mi>O</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>R</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2228;</mml:mo>
<mml:mi>n</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, where <inline-formula id="inf48">
<mml:math id="m70">
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2228;</mml:mo>
<mml:mi>n</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> denotes the maximum of <italic>m</italic> and&#x20;<italic>n</italic>.</p>
<p>
<statement content-type="algorithm" id="alg1">
<label>Algorithm 1</label>
<p>Permutation test using <inline-formula id="inf49">
<mml:math id="m71">
<mml:mrow>
<mml:msub>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>.<list list-type="simple">
<list-item>
<p>
<bold>Input:</bold> Graph samples <inline-formula id="inf50">
<mml:math id="m72">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="script">A</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="script">A</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf51">
<mml:math id="m73">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mi>n</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>; Significance level <italic>&#x3b1;</italic>; Number of permutations&#x20;<italic>R</italic>.</p>
</list-item>
<list-item>
<p>
<bold>Output:</bold> Reject the null hypothesis <inline-formula id="inf52">
<mml:math id="m74">
<mml:mrow>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> if <italic>p</italic>-value <inline-formula id="inf53">
<mml:math id="m75">
<mml:mrow>
<mml:mo>&#x2264;</mml:mo>
<mml:mi>&#x3b1;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>.</p>
</list-item>
<list-item>
<p>1: Compute <inline-formula id="inf54">
<mml:math id="m76">
<mml:mrow>
<mml:msub>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> by <xref ref-type="disp-formula" rid="e13">Eqs. 13</xref>,&#x20;<xref ref-type="disp-formula" rid="e14">14</xref>.</p>
</list-item>
<list-item>
<p>2: <bold>for</bold> <inline-formula id="inf55">
<mml:math id="m77">
<mml:mrow>
<mml:mi>r</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> to <italic>R</italic>&#x20;<bold>do</bold>
</p>
</list-item>
<list-item>
<p>3: Randomly permute the pooled samples <inline-formula id="inf56">
<mml:math id="m78">
<mml:mrow>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="script">A</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="script">A</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mtext>&#x2009;</mml:mtext>
<mml:msub>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi mathvariant="normal">&#x212c;</mml:mi>
<mml:mi>n</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> and divide them into two groups with sample sizes <italic>m</italic> and&#x20;<italic>n</italic>.</p>
</list-item>
<list-item>
<p>4: Compute <inline-formula id="inf57">
<mml:math id="m79">
<mml:mrow>
<mml:msub>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mi>r</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> which is <inline-formula id="inf58">
<mml:math id="m80">
<mml:mrow>
<mml:msub>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> (as given in <xref ref-type="disp-formula" rid="e13">Eqs. 13</xref>, <xref ref-type="disp-formula" rid="e14">14</xref>) calculated using the permuted samples.</p>
</list-item>
<list-item>
<p>5: <bold>end&#x20;for</bold>.</p>
</list-item>
<list-item>
<p>6: Calculate <italic>p</italic>-value &#x3d; <inline-formula id="inf59">
<mml:math id="m81">
<mml:mrow>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mi>r</mml:mi>
<mml:mo>:</mml:mo>
<mml:msub>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mi>r</mml:mi>
</mml:msub>
<mml:mo>&#x2265;</mml:mo>
<mml:msub>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
<mml:mo>/</mml:mo>
<mml:mi>R</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</p>
</list-item>
</list>
</p>
<p>Unlike the test of <xref ref-type="bibr" rid="B8">Ghoshdastidar and von Luxburg (2018)</xref>, which is reliable even for a small number of samples owing to its asymptotic distribution, our test procedure needs a reasonable number of samples to implement the permutation test. Based on simulations, we see that as few as four samples are sufficient to obtain reliable results.</p>
</statement>
</p>
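<p>The steps of <bold>Algorithm 1</bold> can be sketched in Python as follows. This is a minimal illustration, not the implementation used for the experiments: the function names <monospace>t_prime_new</monospace> and <monospace>permutation_test</monospace>, the array layout (one adjacency matrix per leading index), and the toy data with independent Bernoulli entries are all our assumptions.</p>
<preformat><![CDATA[
```python
import numpy as np


def t_prime_new(sample_a, sample_b):
    """Simplified statistic of Eqs. 13, 14 for stacks of adjacency matrices."""
    value = 0.0
    for samples in (sample_a, sample_b):
        size = samples.shape[0]
        total = samples.sum(axis=0)
        # Entrywise: sum_{k != l} x_k x_l = (sum_k x_k)^2 - sum_k x_k^2.
        cross = (total ** 2 - (samples ** 2).sum(axis=0)).sum()
        value += cross / (size * (size - 1))
    return float(value)


def permutation_test(sample_a, sample_b, alpha=0.05, num_permutations=1000, rng=None):
    """Return (p_value, reject) following the steps of Algorithm 1."""
    rng = np.random.default_rng() if rng is None else rng
    sample_a = np.asarray(sample_a, dtype=float)
    sample_b = np.asarray(sample_b, dtype=float)
    m = sample_a.shape[0]
    pooled = np.concatenate([sample_a, sample_b], axis=0)

    observed = t_prime_new(sample_a, sample_b)            # step 1
    exceed = 0
    for _ in range(num_permutations):                     # steps 2 to 5
        order = rng.permutation(pooled.shape[0])          # step 3
        permuted = t_prime_new(pooled[order[:m]], pooled[order[m:]])  # step 4
        if permuted >= observed:
            exceed += 1
    p_value = exceed / num_permutations                   # step 6
    return p_value, p_value <= alpha


# Toy data (assumption): entries drawn independently with different densities.
rng = np.random.default_rng(1)
sample_a = (rng.random((6, 30, 30)) < 0.1).astype(float)
sample_b = (rng.random((6, 30, 30)) < 0.5).astype(float)
p_value, reject = permutation_test(sample_a, sample_b, num_permutations=500, rng=rng)
print(p_value, reject)
```
]]></preformat>
<p>The loop evaluates the statistic once per permutation, so the running time scales linearly in <italic>R</italic>.</p>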
</sec>
<sec id="s4">
<title>4 Experiments</title>
<p>Here, we first examine the performance of the new test statistic under diverse settings through simulation studies. We then apply the new test to real-world applications.</p>
<sec id="s4-1">
<title>4.1 Simulated Data</title>
<p>To evaluate the performance of the new test, we examine sparse graphs from stochastic block models with two communities, as studied in <xref ref-type="bibr" rid="B28">Tang et&#x20;al. (2017a)</xref> and <xref ref-type="bibr" rid="B8">Ghoshdastidar and von Luxburg (2018)</xref>. Specifically, we consider sparse graphs with <italic>d</italic> nodes, where nodes in the same community of size <inline-formula id="inf60">
<mml:math id="m82">
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> are connected with edge probability <italic>p</italic>, and nodes in different communities of size <inline-formula id="inf61">
<mml:math id="m83">
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> are connected with edge probability <italic>q</italic>. In other words, we define <italic>P</italic> and <italic>Q</italic> as follows:<disp-formula id="equ9">
<mml:math id="m84">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo>:</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mi>p</mml:mi>
</mml:mtd>
<mml:mtd>
<mml:mi>q</mml:mi>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mi>q</mml:mi>
</mml:mtd>
<mml:mtd>
<mml:mi>p</mml:mi>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>v</mml:mi>
<mml:mi>s</mml:mi>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mtext>&#xa0;&#xa0;</mml:mtext>
<mml:mi>Q</mml:mi>
<mml:mo>:</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
</mml:mrow>
</mml:mtd>
<mml:mtd>
<mml:mi>q</mml:mi>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mi>q</mml:mi>
</mml:mtd>
<mml:mtd>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>.</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>We generate <italic>m</italic> samples from <italic>P</italic> and <italic>n</italic> samples from <italic>Q</italic>. Under the null, <inline-formula id="inf62">
<mml:math id="m85">
<mml:mrow>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>, implying <inline-formula id="inf63">
<mml:math id="m86">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>Q</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>, whereas <inline-formula id="inf64">
<mml:math id="m87">
<mml:mrow>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
<mml:mo>&#x3e;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> under <inline-formula id="inf65">
<mml:math id="m88">
<mml:mrow>
<mml:msub>
<mml:mi>H</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, implying <inline-formula id="inf66">
<mml:math id="m89">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo>&#x2260;</mml:mo>
<mml:mi>Q</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>. Following <xref ref-type="bibr" rid="B8">Ghoshdastidar and von Luxburg (2018)</xref>, we set <inline-formula id="inf67">
<mml:math id="m90">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.1</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf68">
<mml:math id="m91">
<mml:mrow>
<mml:mi>q</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.05</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>, and <inline-formula id="inf69">
<mml:math id="m92">
<mml:mrow>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> for the null hypothesis, whereas <inline-formula id="inf70">
<mml:math id="m93">
<mml:mrow>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.04</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> for the alternative hypothesis. We examine the performance of the new test for different choices of <inline-formula id="inf71">
<mml:math id="m94">
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mn>100,200,300,400,500</mml:mn>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>.</p>
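<p>The simulation design above can be sketched in Python as follows. The helper names <monospace>sbm_probability_matrix</monospace> and <monospace>sample_undirected_graphs</monospace> are hypothetical, and the sketch assumes undirected graphs without self-loops and an even <italic>d</italic>; these conventions are our assumptions and are not stated in the text.</p>
<preformat><![CDATA[
```python
import numpy as np


def sbm_probability_matrix(d, p, q, eps=0.0):
    """Two-community edge probabilities: p + eps within, q between communities."""
    if d % 2 != 0:
        raise ValueError("d must be even to form two communities of size d/2")
    labels = np.arange(d) >= d // 2
    same_community = labels[:, None] == labels[None, :]
    return np.where(same_community, p + eps, q)


def sample_undirected_graphs(prob, num_graphs, rng):
    """Draw symmetric 0/1 adjacency matrices with independent edges."""
    d = prob.shape[0]
    # Sample the strict upper triangle, then mirror it (no self-loops).
    upper = np.triu(rng.random((num_graphs, d, d)) < prob, k=1)
    return (upper | upper.transpose(0, 2, 1)).astype(np.int8)


rng = np.random.default_rng(0)
d, p, q, eps = 100, 0.1, 0.05, 0.04
prob_p = sbm_probability_matrix(d, p, q)           # model P
prob_q = sbm_probability_matrix(d, p, q, eps=eps)  # model Q under the alternative
graphs_p = sample_undirected_graphs(prob_p, num_graphs=4, rng=rng)  # m = 4
graphs_q = sample_undirected_graphs(prob_q, num_graphs=4, rng=rng)  # n = 4
print(graphs_p.shape, graphs_q.shape)
```
]]></preformat>
<p>Setting <monospace>eps=0.0</monospace> for both models reproduces the null setting, in which the two samples come from the same distribution.</p>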
<p>The performance of the test based on <inline-formula id="inf72">
<mml:math id="m95">
<mml:mrow>
<mml:msub>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is studied and compared to existing methods. <inline-formula id="inf73">
<mml:math id="m96">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> in <xref ref-type="bibr" rid="B8">Ghoshdastidar and von Luxburg (2018)</xref> is the bootstrap test based on <inline-formula id="inf74">
<mml:math id="m97">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, and <inline-formula id="inf75">
<mml:math id="m98">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>a</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>y</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> denotes the normal dominance test based on the asymptotic distribution of <inline-formula id="inf76">
<mml:math id="m99">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> (also from <xref ref-type="bibr" rid="B8">Ghoshdastidar and von Luxburg (2018)</xref>). We denote the permutation test based on <inline-formula id="inf77">
<mml:math id="m100">
<mml:mrow>
<mml:msub>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> as <inline-formula id="inf78">
<mml:math id="m101">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>. The estimated power is calculated as the proportion of null rejections at the <inline-formula id="inf79">
<mml:math id="m102">
<mml:mrow>
<mml:mi>&#x3b1;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.05</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> level out of 100 independent trials for each of these methods. For <inline-formula id="inf80">
<mml:math id="m103">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf81">
<mml:math id="m104">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>, <italic>p</italic>-values are determined by 1,000 permutation runs to ensure a reliable comparison.</p>
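<p>The power estimate described above is the fraction of trials in which the <italic>p</italic>-value does not exceed the significance level. The Python sketch below illustrates this calculation with made-up <italic>p</italic>-values; the function name <monospace>estimate_power</monospace> and the accompanying binomial standard error are our additions for illustration and are not part of the reported analysis.</p>
<preformat><![CDATA[
```python
import numpy as np


def estimate_power(p_values, alpha=0.05):
    """Return (power, standard_error) from the p-values of independent trials."""
    p_values = np.asarray(p_values, dtype=float)
    rejections = p_values <= alpha
    power = float(rejections.mean())
    # Binomial Monte Carlo standard error of the rejection proportion.
    standard_error = float(np.sqrt(power * (1.0 - power) / p_values.size))
    return power, standard_error


# Hypothetical p-values from 100 trials: 17 rejections at alpha = 0.05.
hypothetical_p_values = [0.01] * 17 + [0.50] * 83
power, standard_error = estimate_power(hypothetical_p_values, alpha=0.05)
print(f"power = {power:.2f}, standard error = {standard_error:.3f}")
```
]]></preformat>
<p>The standard error gives a rough sense of the Monte Carlo uncertainty attached to a power estimate based on a finite number of trials.</p>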
<p>
<xref ref-type="fig" rid="F1">Figure&#x20;1</xref> shows results for the undirected graph case under different settings. When the two sample sizes are equal (upper panels), so that the existing methods can be applied, we see that the proposed test outperforms all other methods. Note that when the sample sizes of the two graph populations differ (i.e.,&#x20;<inline-formula id="inf82">
<mml:math id="m105">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2260;</mml:mo>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>), the existing methods cannot be applied. We see that the proposed test still performs well under sample imbalance and the large <italic>d</italic> regime.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Performance comparison of different tests for undirected graphs.</p>
</caption>
<graphic xlink:href="frai-04-589632-g001.tif"/>
</fig>
<p>We also evaluate the performance of the new test for directed graphs under various configurations (<xref ref-type="fig" rid="F2">Figure&#x20;2</xref>). The existing methods are not applicable to directed graphs, but we transform <inline-formula id="inf83">
<mml:math id="m106">
<mml:mrow>
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mrow>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> so that it can be applied to directed graphs. The results show that the new test also has better power than the existing method in two-sample testing for directed graphs and works well for large graphs.</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>Performance comparison of the proposed test for directed graphs.</p>
</caption>
<graphic xlink:href="frai-04-589632-g002.tif"/>
</fig>
<p>Next, we examine the effect of the sparsity on the performance of the tests. To this end, we consider the same setting as above, but with different choices of <inline-formula id="inf84">
<mml:math id="m107">
<mml:mrow>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mn>0.02</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>0.03</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>0.04</mml:mn>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> for each of the methods. A small <inline-formula id="inf85">
<mml:math id="m108">
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
</mml:math>
</inline-formula> implies a small difference between <italic>P</italic> and <italic>Q</italic>, making it harder for the tests to detect the discrepancy between the two samples. <xref ref-type="table" rid="T1">Table&#x20;1</xref> shows results for undirected graphs with variations in the sparsity level <inline-formula id="inf86">
<mml:math id="m109">
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
</mml:math>
</inline-formula>. We see that, in general, the proposed method is superior to the existing methods. This indicates that our test statistic is more effective in detecting the inhomogeneity between two samples than the existing methods. The effect of the sparsity level <inline-formula id="inf87">
<mml:math id="m110">
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
</mml:math>
</inline-formula> on the performance of the proposed test for directed graphs can be found in <xref ref-type="table" rid="T2">Table&#x20;2</xref>. We see that the proposed test also performs better than the existing method for directed graph settings, and as expected, the power increases as <inline-formula id="inf88">
<mml:math id="m111">
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
</mml:math>
</inline-formula> or the number of samples increases.</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Power comparison of different tests for undirected graphs with varying sparsity levels.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">
<inline-formula id="inf89">
<mml:math id="m112">
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>4</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th colspan="3" align="center">
<inline-formula id="inf90">
<mml:math id="m113">
<mml:mrow>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.02</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th colspan="3" align="center">
<inline-formula id="inf91">
<mml:math id="m114">
<mml:mrow>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.03</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th colspan="3" align="center">
<inline-formula id="inf92">
<mml:math id="m115">
<mml:mrow>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.04</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
</tr>
<tr>
<td align="left">
<italic>d</italic>
</td>
<td align="center">
<inline-formula id="inf93">
<mml:math id="m116">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf94">
<mml:math id="m117">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>a</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>y</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf95">
<mml:math id="m118">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf96">
<mml:math id="m119">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf97">
<mml:math id="m120">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>a</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>y</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf98">
<mml:math id="m121">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf99">
<mml:math id="m122">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf100">
<mml:math id="m123">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>a</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>y</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf101">
<mml:math id="m124">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">100</td>
<td align="char" char=".">0.09</td>
<td align="char" char=".">0.05</td>
<td align="char" char=".">
<bold>0.10</bold>
</td>
<td align="char" char=".">
<bold>0.10</bold>
</td>
<td align="char" char=".">0.03</td>
<td align="char" char=".">0.08</td>
<td align="char" char=".">
<bold>0.17</bold>
</td>
<td align="char" char=".">0.05</td>
<td align="char" char=".">
<bold>0.17</bold>
</td>
</tr>
<tr>
<td align="left">200</td>
<td align="char" char=".">
<bold>0.09</bold>
</td>
<td align="char" char=".">0.05</td>
<td align="char" char=".">0.07</td>
<td align="char" char=".">
<bold>0.18</bold>
</td>
<td align="char" char=".">0.10</td>
<td align="char" char=".">
<bold>0.18</bold>
</td>
<td align="char" char=".">
<bold>0.39</bold>
</td>
<td align="char" char=".">0.22</td>
<td align="char" char=".">
<bold>0.39</bold>
</td>
</tr>
<tr>
<td align="left">300</td>
<td align="char" char=".">
<bold>0.17</bold>
</td>
<td align="char" char=".">0.03</td>
<td align="char" char=".">
<bold>0.17</bold>
</td>
<td align="char" char=".">0.34</td>
<td align="char" char=".">0.19</td>
<td align="char" char=".">
<bold>0.37</bold>
</td>
<td align="char" char=".">0.50</td>
<td align="char" char=".">0.40</td>
<td align="char" char=".">
<bold>0.66</bold>
</td>
</tr>
<tr>
<td align="left">400</td>
<td align="char" char=".">0.11</td>
<td align="char" char=".">0.09</td>
<td align="char" char=".">
<bold>0.15</bold>
</td>
<td align="char" char=".">0.40</td>
<td align="char" char=".">0.26</td>
<td align="char" char=".">
<bold>0.53</bold>
</td>
<td align="char" char=".">0.78</td>
<td align="char" char=".">0.71</td>
<td align="char" char=".">
<bold>0.90</bold>
</td>
</tr>
<tr>
<td align="left">500</td>
<td align="char" char=".">
<bold>0.22</bold>
</td>
<td align="char" char=".">0.08</td>
<td align="char" char=".">
<bold>0.22</bold>
</td>
<td align="char" char=".">0.63</td>
<td align="char" char=".">0.48</td>
<td align="char" char=".">
<bold>0.75</bold>
</td>
<td align="char" char=".">0.91</td>
<td align="char" char=".">0.89</td>
<td align="char" char=".">
<bold>0.98</bold>
</td>
</tr>
</tbody>
</table>
<table>
<thead>
<tr>
<td align="left">
<inline-formula id="inf102">
<mml:math id="m125">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>8</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td colspan="3" align="center">
<inline-formula id="inf103">
<mml:math id="m126">
<mml:mrow>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.02</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td colspan="3" align="center">
<inline-formula id="inf104">
<mml:math id="m127">
<mml:mrow>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.03</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td colspan="3" align="center">
<inline-formula id="inf105">
<mml:math id="m128">
<mml:mrow>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.04</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
</tr>
<tr>
<td align="left">
<italic>d</italic>
</td>
<td align="center">
<inline-formula id="inf106">
<mml:math id="m129">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf107">
<mml:math id="m130">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>a</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>y</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf108">
<mml:math id="m131">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf109">
<mml:math id="m132">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf110">
<mml:math id="m133">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>a</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>y</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf111">
<mml:math id="m134">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf112">
<mml:math id="m135">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf113">
<mml:math id="m136">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>a</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>y</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf114">
<mml:math id="m137">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
</tr>
</thead>
<tbody>
<tr>
<td align="left">100</td>
<td align="char" char=".">
<bold>0.13</bold>
</td>
<td align="char" char=".">0.05</td>
<td align="char" char=".">0.08</td>
<td align="char" char=".">0.17</td>
<td align="char" char=".">0.08</td>
<td align="char" char=".">
<bold>0.23</bold>
</td>
<td align="char" char=".">0.39</td>
<td align="char" char=".">0.21</td>
<td align="char" char=".">
<bold>0.64</bold>
</td>
</tr>
<tr>
<td align="left">200</td>
<td align="char" char=".">0.19</td>
<td align="char" char=".">0.09</td>
<td align="char" char=".">
<bold>0.31</bold>
</td>
<td align="char" char=".">0.40</td>
<td align="char" char=".">0.20</td>
<td align="char" char=".">
<bold>0.67</bold>
</td>
<td align="char" char=".">0.80</td>
<td align="char" char=".">0.66</td>
<td align="char" char=".">
<bold>0.99</bold>
</td>
</tr>
<tr>
<td align="left">300</td>
<td align="char" char=".">0.36</td>
<td align="char" char=".">0.22</td>
<td align="char" char=".">
<bold>0.49</bold>
</td>
<td align="char" char=".">0.73</td>
<td align="char" char=".">0.58</td>
<td align="char" char=".">
<bold>0.92</bold>
</td>
<td align="char" char=".">0.98</td>
<td align="char" char=".">0.94</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
</tr>
<tr>
<td align="left">400</td>
<td align="char" char=".">0.37</td>
<td align="char" char=".">0.19</td>
<td align="char" char=".">
<bold>0.61</bold>
</td>
<td align="char" char=".">0.92</td>
<td align="char" char=".">0.86</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
<td align="char" char=".">0.99</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
</tr>
<tr>
<td align="left">500</td>
<td align="char" char=".">0.51</td>
<td align="char" char=".">0.31</td>
<td align="char" char=".">
<bold>0.76</bold>
</td>
<td align="char" char=".">0.98</td>
<td align="char" char=".">0.96</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="Tfn1">
<p>Bold values indicate the largest power of the test under each condition.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>Power of the proposed test for directed graphs with varying sparsity levels.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">
<inline-formula id="inf115">
<mml:math id="m138">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>4</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th colspan="2" align="center">
<inline-formula id="inf116">
<mml:math id="m139">
<mml:mrow>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.02</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th colspan="2" align="center">
<inline-formula id="inf117">
<mml:math id="m140">
<mml:mrow>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.03</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th colspan="2" align="center">
<inline-formula id="inf118">
<mml:math id="m141">
<mml:mrow>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.04</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
</tr>
<tr>
<td align="left">
<italic>d</italic>
</td>
<td align="center">
<inline-formula id="inf119">
<mml:math id="m142">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf120">
<mml:math id="m143">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf121">
<mml:math id="m144">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf122">
<mml:math id="m145">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf123">
<mml:math id="m146">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf124">
<mml:math id="m147">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">100</td>
<td align="char" char=".">
<bold>0.13</bold>
</td>
<td align="char" char=".">0.09</td>
<td align="char" char=".">
<bold>0.11</bold>
</td>
<td align="char" char=".">
<bold>0.11</bold>
</td>
<td align="char" char=".">0.21</td>
<td align="char" char=".">
<bold>0.26</bold>
</td>
</tr>
<tr>
<td align="left">200</td>
<td align="char" char=".">0.11</td>
<td align="char" char=".">
<bold>0.12</bold>
</td>
<td align="char" char=".">0.25</td>
<td align="char" char=".">
<bold>0.27</bold>
</td>
<td align="char" char=".">0.49</td>
<td align="char" char=".">
<bold>0.66</bold>
</td>
</tr>
<tr>
<td align="left">300</td>
<td align="char" char=".">0.17</td>
<td align="char" char=".">
<bold>0.22</bold>
</td>
<td align="char" char=".">0.46</td>
<td align="char" char=".">
<bold>0.61</bold>
</td>
<td align="char" char=".">0.76</td>
<td align="char" char=".">
<bold>0.94</bold>
</td>
</tr>
<tr>
<td align="left">400</td>
<td align="char" char=".">
<bold>0.20</bold>
</td>
<td align="char" char=".">
<bold>0.20</bold>
</td>
<td align="char" char=".">0.60</td>
<td align="char" char=".">
<bold>0.72</bold>
</td>
<td align="char" char=".">0.95</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
</tr>
<tr>
<td align="left">500</td>
<td align="char" char=".">0.36</td>
<td align="char" char=".">
<bold>0.37</bold>
</td>
<td align="char" char=".">0.77</td>
<td align="char" char=".">
<bold>0.93</bold>
</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
</tr>
</tbody>
</table>
<table>
<thead>
<tr>
<td align="left">
<inline-formula id="inf125">
<mml:math id="m148">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>8</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td colspan="2" align="center">
<inline-formula id="inf126">
<mml:math id="m149">
<mml:mrow>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.02</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td colspan="2" align="center">
<inline-formula id="inf127">
<mml:math id="m150">
<mml:mrow>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.03</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td colspan="2" align="center">
<inline-formula id="inf128">
<mml:math id="m151">
<mml:mrow>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.04</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
</tr>
<tr>
<td align="left">
<italic>d</italic>
</td>
<td align="center">
<inline-formula id="inf129">
<mml:math id="m152">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf130">
<mml:math id="m153">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf131">
<mml:math id="m154">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf132">
<mml:math id="m155">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf133">
<mml:math id="m156">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td align="center">
<inline-formula id="inf134">
<mml:math id="m157">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
</tr>
</thead>
<tbody>
<tr>
<td align="left">100</td>
<td align="char" char=".">0.14</td>
<td align="char" char=".">
<bold>0.18</bold>
</td>
<td align="char" char=".">0.20</td>
<td align="char" char=".">
<bold>0.42</bold>
</td>
<td align="char" char=".">0.66</td>
<td align="char" char=".">
<bold>0.93</bold>
</td>
</tr>
<tr>
<td align="left">200</td>
<td align="char" char=".">0.26</td>
<td align="char" char=".">
<bold>0.38</bold>
</td>
<td align="char" char=".">0.77</td>
<td align="char" char=".">
<bold>0.94</bold>
</td>
<td align="char" char=".">0.97</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
</tr>
<tr>
<td align="left">300</td>
<td align="char" char=".">0.43</td>
<td align="char" char=".">
<bold>0.68</bold>
</td>
<td align="char" char=".">0.94</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
</tr>
<tr>
<td align="left">400</td>
<td align="char" char=".">0.62</td>
<td align="char" char=".">
<bold>0.89</bold>
</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
</tr>
<tr>
<td align="left">500</td>
<td align="char" char=".">0.80</td>
<td align="char" char=".">
<bold>0.96</bold>
</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="Tfn2">
<p>Bold values indicate the largest power of the test under each condition.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>This observation becomes particularly evident when the number of samples is large. We therefore study how the number of samples affects the performance of the tests. For this study, we consider <inline-formula id="inf135">
<mml:math id="m158">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mn>10,20,50</mml:mn>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> with relatively small graphs <inline-formula id="inf136">
<mml:math id="m159">
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mn>50,100,150,200</mml:mn>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> and fix <inline-formula id="inf137">
<mml:math id="m160">
<mml:mrow>
<mml:mi mathvariant="italic">&#x3f5;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.02</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>. This analysis is designed to reveal the potential impact of sample size in high-dimensional settings. <xref ref-type="table" rid="T3">Tables 3</xref> and <xref ref-type="table" rid="T4">4</xref> report the power of the tests for varying numbers of samples. The proposed test attains the highest power in all but one configuration (undirected graphs with <italic>m</italic>&#x20;&#x3d;&#x20;<italic>n</italic>&#x20;&#x3d;&#x20;10 and <italic>d</italic>&#x20;&#x3d;&#x20;150, where the power is 0.15 against 0.16 for the best competing test), and its advantage widens as the sample size grows. These results suggest that the new test works well in high-dimensional settings.</p>
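<p>Empirical power of the kind reported here can be estimated as the fraction of simulated trials in which a test rejects the null hypothesis. The sketch below illustrates that loop only. It is not the simulation code of this study: the Erd&#x151;s&#x2013;R&#xe9;nyi graph model, the base edge probability of 0.3, the stand-in statistic <monospace>mean_gap</monospace>, the permutation calibration, and all function names are assumptions introduced for illustration.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def sample_graphs(rng, count, d, edge_prob):
    """Draw `count` undirected, unweighted graphs on d nodes (assumed Erdos-Renyi model)."""
    upper = np.triu(rng.random((count, d, d)) < edge_prob, k=1)
    return (upper | upper.transpose(0, 2, 1)).astype(float)


def mean_gap(x, y):
    """Stand-in statistic: Frobenius norm of the difference of the mean adjacency matrices."""
    return np.linalg.norm(x.mean(axis=0) - y.mean(axis=0))


def permutation_pvalue(rng, x, y, n_perm):
    """Permutation p-value of mean_gap under random relabeling of the pooled graphs."""
    pooled = np.concatenate([x, y])
    observed = mean_gap(x, y)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        first = pooled[idx[:len(x)]]
        second = pooled[idx[len(x):]]
        exceed += int(mean_gap(first, second) >= observed)
    return (1 + exceed) / (1 + n_perm)


def estimate_power(d, m, n, eps, base_prob=0.3, alpha=0.05,
                   n_trials=100, n_perm=199, seed=0):
    """Fraction of simulated trials in which the test rejects at level alpha."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_trials):
        x = sample_graphs(rng, m, d, base_prob)        # first sample of m graphs
        y = sample_graphs(rng, n, d, base_prob + eps)  # second sample, shifted by eps
        rejections += int(permutation_pvalue(rng, x, y, n_perm) <= alpha)
    return rejections / n_trials
```
]]></preformat>
<p>Calling, for example, <monospace>estimate_power(d=50, m=10, n=10, eps=0.02)</monospace> returns one estimated power value for one cell of such a grid. Because the model and statistic here are stand-ins, the numbers it produces are not expected to match the tables.</p>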
<table-wrap id="T3" position="float">
<label>TABLE 3</label>
<caption>
<p>Power comparison of different tests for undirected graphs with varying sample&#x20;sizes.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left"/>
<th colspan="3" align="center">
<inline-formula id="inf138">
<mml:math id="m161">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>10</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th colspan="3" align="center">
<inline-formula id="inf139">
<mml:math id="m162">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>20</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th colspan="3" align="center">
<inline-formula id="inf140">
<mml:math id="m163">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>50</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
</tr>
<tr>
<th align="left">
<italic>d</italic>
</th>
<th align="center">
<inline-formula id="inf141">
<mml:math id="m164">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th align="center">
<inline-formula id="inf142">
<mml:math id="m165">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>a</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>y</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th align="center">
<inline-formula id="inf143">
<mml:math id="m166">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th align="center">
<inline-formula id="inf144">
<mml:math id="m167">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th align="center">
<inline-formula id="inf145">
<mml:math id="m168">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>a</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>y</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th align="center">
<inline-formula id="inf146">
<mml:math id="m169">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th align="center">
<inline-formula id="inf147">
<mml:math id="m170">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th align="center">
<inline-formula id="inf148">
<mml:math id="m171">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>a</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>y</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th align="center">
<inline-formula id="inf149">
<mml:math id="m172">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">50</td>
<td align="char" char=".">0.08</td>
<td align="char" char=".">0.08</td>
<td align="char" char=".">
<bold>0.12</bold>
</td>
<td align="char" char=".">0.11</td>
<td align="char" char=".">0.04</td>
<td align="char" char=".">
<bold>0.16</bold>
</td>
<td align="char" char=".">0.28</td>
<td align="char" char=".">0.15</td>
<td align="char" char=".">
<bold>0.43</bold>
</td>
</tr>
<tr>
<td align="left">100</td>
<td align="char" char=".">0.16</td>
<td align="char" char=".">0.08</td>
<td align="char" char=".">
<bold>0.17</bold>
</td>
<td align="char" char=".">0.18</td>
<td align="char" char=".">0.05</td>
<td align="char" char=".">
<bold>0.23</bold>
</td>
<td align="char" char=".">0.61</td>
<td align="char" char=".">0.42</td>
<td align="char" char=".">
<bold>0.81</bold>
</td>
</tr>
<tr>
<td align="left">150</td>
<td align="char" char=".">
<bold>0.16</bold>
</td>
<td align="char" char=".">0.03</td>
<td align="char" char=".">0.15</td>
<td align="char" char=".">0.21</td>
<td align="char" char=".">0.14</td>
<td align="char" char=".">
<bold>0.30</bold>
</td>
<td align="char" char=".">0.70</td>
<td align="char" char=".">0.52</td>
<td align="char" char=".">
<bold>0.97</bold>
</td>
</tr>
<tr>
<td align="left">200</td>
<td align="char" char=".">0.14</td>
<td align="char" char=".">0.06</td>
<td align="char" char=".">
<bold>0.22</bold>
</td>
<td align="char" char=".">0.37</td>
<td align="char" char=".">0.21</td>
<td align="char" char=".">
<bold>0.56</bold>
</td>
<td align="char" char=".">0.94</td>
<td align="char" char=".">0.89</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="Tfn3">
<p>Bold values indicate the largest power of the test under each condition.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<table-wrap id="T4" position="float">
<label>TABLE 4</label>
<caption>
<p>Power comparison of different tests for directed graphs with varying sample&#x20;sizes.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Directed</th>
<th colspan="2" align="center">
<inline-formula id="inf150">
<mml:math id="m173">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>10</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th colspan="2" align="center">
<inline-formula id="inf151">
<mml:math id="m174">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>20</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th colspan="2" align="center">
<inline-formula id="inf152">
<mml:math id="m175">
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>50</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
</tr>
<tr>
<th align="left">
<italic>d</italic>
</th>
<th align="center">
<inline-formula id="inf153">
<mml:math id="m176">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th align="center">
<inline-formula id="inf154">
<mml:math id="m177">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th align="center">
<inline-formula id="inf155">
<mml:math id="m178">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th align="center">
<inline-formula id="inf156">
<mml:math id="m179">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th align="center">
<inline-formula id="inf157">
<mml:math id="m180">
<mml:mrow>
<mml:mi>T</mml:mi>
<mml:mo>_</mml:mo>
<mml:mi>f</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
<th align="center">
<inline-formula id="inf158">
<mml:math id="m181">
<mml:mrow>
<mml:msup>
<mml:mi>T</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>_</mml:mo>
<mml:mi>n</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>w</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>
</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">50</td>
<td align="char" char=".">0.05</td>
<td align="char" char=".">
<bold>0.09</bold>
</td>
<td align="char" char=".">0.12</td>
<td align="char" char=".">
<bold>0.28</bold>
</td>
<td align="char" char=".">0.49</td>
<td align="char" char=".">
<bold>0.77</bold>
</td>
</tr>
<tr>
<td align="left">100</td>
<td align="char" char=".">0.15</td>
<td align="char" char=".">
<bold>0.24</bold>
</td>
<td align="char" char=".">0.29</td>
<td align="char" char=".">
<bold>0.43</bold>
</td>
<td align="char" char=".">0.82</td>
<td align="char" char=".">
<bold>0.99</bold>
</td>
</tr>
<tr>
<td align="left">150</td>
<td align="char" char=".">0.15</td>
<td align="char" char=".">
<bold>0.21</bold>
</td>
<td align="char" char=".">0.39</td>
<td align="char" char=".">
<bold>0.52</bold>
</td>
<td align="char" char=".">0.95</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
</tr>
<tr>
<td align="left">200</td>
<td align="char" char=".">0.28</td>
<td align="char" char=".">
<bold>0.42</bold>
</td>
<td align="char" char=".">0.66</td>
<td align="char" char=".">
<bold>0.86</bold>
</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
<td align="char" char=".">
<bold>1.00</bold>
</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="Tfn4">
<p>Bold values indicate the largest power of the test under each condition.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="s4-2">
<title>4.2 Real-World Applications</title>
<sec id="s4-2-1">
<title>4.2.1 Phone-Call Network</title>
<p>The MIT Media Laboratory conducted a study following 87 subjects who used mobile phones with a pre-installed device that can record call logs. The study lasted for 330&#x20;days from July 2004 to June 2005 (<xref ref-type="bibr" rid="B6">Eagle et&#x20;al., 2009</xref>). Given the richness of this dataset, one question of interest is whether the phone call patterns among subjects differ between weekends and weekdays. These patterns can be viewed as a representation of the personal and professional relationships of a subject. After removing days with no calls among subjects, there are <inline-formula id="inf159">
<mml:math id="m182">
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>299</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula> networks in total (one per remaining day) on 87 subjects (nodes), with adjacency matrices <inline-formula id="inf160">
<mml:math id="m183">
<mml:mrow>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> with value one for element <inline-formula id="inf161">
<mml:math id="m184">
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> if subject <italic>i</italic> called <italic>j</italic> on day <italic>t</italic> and 0 otherwise. Of these networks, 85 correspond to weekend days and 214 to weekdays. This is an example of unweighted directed graphs with imbalanced sample&#x20;sizes.</p>
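<p>A minimal sketch of the network construction described above is given below. It is an illustration under stated assumptions rather than the code used in the study: the call-log layout <monospace>(day, caller, callee)</monospace>, the function names, and the toy records are all hypothetical, and weekends are taken to be Saturday and Sunday.</p>
<preformat preformat-type="code"><![CDATA[
```python
from datetime import date

import numpy as np


def daily_call_networks(call_log, n_subjects):
    """Build one binary directed adjacency matrix per day that has at least one call.

    call_log: iterable of (day, caller, callee), where day is a datetime.date and
    caller/callee are integer subject indices in [0, n_subjects).
    """
    networks = {}
    for day, caller, callee in call_log:
        if day not in networks:
            networks[day] = np.zeros((n_subjects, n_subjects), dtype=int)
        networks[day][caller, callee] = 1  # entry (i, j) is 1 if i called j that day
    return networks


def split_weekday_weekend(networks):
    """Split the daily networks into (weekday list, weekend list), ordered by date."""
    weekdays, weekends = [], []
    for day in sorted(networks):
        # date.weekday(): Monday is 0, ..., Saturday is 5, Sunday is 6
        target = weekends if day.weekday() >= 5 else weekdays
        target.append(networks[day])
    return weekdays, weekends


# Hypothetical toy log with three subjects (indices 0, 1, 2).
toy_log = [
    (date(2004, 7, 19), 0, 1),  # Monday
    (date(2004, 7, 19), 0, 1),  # repeated call, entry stays 1
    (date(2004, 7, 19), 2, 0),
    (date(2004, 7, 24), 1, 2),  # Saturday
    (date(2004, 7, 25), 1, 0),  # Sunday
]
weekdays, weekends = split_weekday_weekend(daily_call_networks(toy_log, n_subjects=3))
```
]]></preformat>
<p>Days on which no call is recorded never receive a matrix, which mirrors the removal of empty days. Repeated calls between the same pair on one day leave the entry at one because the graphs are unweighted, and the matrices are not symmetrized because the graphs are directed.</p>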
<p>The test statistic and corresponding <italic>p</italic>-value are shown in <xref ref-type="table" rid="T5">Table&#x20;5</xref>. The new test rejects the null hypothesis of equal distributions at the 0.05 significance level. This outcome is intuitively plausible, as phone call patterns on weekends (personal) can differ from those on weekdays (work).</p>
<table-wrap id="T5" position="float">
<label>TABLE 5</label>
<caption>
<p>Test summary on the phone-call network.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Test statistic</th>
<th align="center">
<italic>p</italic>-value</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">15.8131</td>
<td align="center">
<inline-formula id="inf162">
<mml:math id="m185">
<mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:mn>0.001</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s4-2-2">
<title>4.2.2 Safety-Critical Healthcare Application</title>
<p>Modeling relationships between functional or structural regions of the brain is a significant step toward understanding, diagnosing, and eventually treating a range of neurological conditions, including epilepsy, stroke, and autism. A variety of sensing mechanisms, such as functional MRI (fMRI), electroencephalography (EEG), and electrocorticography (ECoG), are commonly used to uncover patterns in both brain structure and function. In particular, resting-state fMRI (<xref ref-type="bibr" rid="B16">Kelly et&#x20;al., 2008</xref>) has proven effective in identifying diagnostic biomarkers for mental health conditions such as Alzheimer's disease (<xref ref-type="bibr" rid="B4">Chen et&#x20;al., 2011</xref>) and autism (<xref ref-type="bibr" rid="B24">Plitt et&#x20;al., 2015</xref>). At the core of these neuropathology studies are predictive models that map variations in brain function, obtained as time-series measurements in regions of interest, to clinical scores. For example, the Autism Brain Imaging Data Exchange (ABIDE) is a collaborative effort (<xref ref-type="bibr" rid="B5">Di Martino et&#x20;al., 2014</xref>) that seeks to build a data-driven approach to autism diagnosis. Further, several published studies have reported that predictive models can reveal patterns in brain activity that act as effective biomarkers for classifying patients with mental illness (<xref ref-type="bibr" rid="B24">Plitt et&#x20;al., 2015</xref>). Following current practice (<xref ref-type="bibr" rid="B23">Parisot et&#x20;al., 2017</xref>), we treat graphs as a natural data structure for modeling the functional connectivity of the human brain (e.g., from fMRI): nodes correspond to the functional regions of the brain, and edges represent the functional correlations between regions. Defining appropriate metrics to compare these graphs, and thereby identify suitable biomarkers for autism severity, has been of significant research interest. We show that the proposed two-sample test is highly effective at characterizing stratification based on demographics (e.g., age, gender) as well as autism severity states (normal vs. abnormal) across a large population of brain networks.</p>
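<p>The graph representation described above can be sketched as follows. This is only an illustration under stated assumptions: the function name <monospace>connectivity_graph</monospace> is hypothetical, Pearson correlation is assumed as the edge weight, and the time series are synthetic rather than ABIDE measurements.</p>
<preformat>
```python
import numpy as np


def connectivity_graph(time_series):
    """Build a weighted, undirected graph from regional time series.

    time_series has shape (n_timepoints, n_regions). Edge weights are
    Pearson correlations between regions (an assumed choice of weight).
    """
    weights = np.corrcoef(time_series, rowvar=False)
    np.fill_diagonal(weights, 0.0)  # no self-loops
    return weights


# Synthetic stand-in for one subject: 200 time points, 111 regions.
rng = np.random.default_rng(0)
series = rng.normal(size=(200, 111))
graph = connectivity_graph(series)
print(graph.shape)  # (111, 111)
```
</preformat>
<p>The resulting matrix is symmetric with a zero diagonal, which matches the weighted and undirected setting considered in this example.</p>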
<p>The dataset contains 871 graphs in total, and each graph consists of 111 nodes (functional regions). Through this example, we study the effectiveness of our approach in the weighted, undirected graph setting. In particular, we focus on detecting variations across strata arising from demographics (gender, age). Specifically, the group of normal control subjects and the group of subjects diagnosed with Autism Spectrum Disorders (ADS) are each sub-divided by gender (male or female) and by age (under 20 or over 20), and we compare these sub-groups using the proposed test. <xref ref-type="table" rid="T6">Table&#x20;6</xref> shows the distribution of graphs in the dataset, and <xref ref-type="fig" rid="F3">Figure&#x20;3</xref> shows example network structures from the normal-male and normal-female groups.</p>
<table-wrap id="T6" position="float">
<label>TABLE 6</label>
<caption>
<p>Distribution of graphs. &#x201c;M&#x201d; and &#x201c;F&#x201d; indicate male and female, respectively. &#x201c;&#x3c;20&#x201d; and &#x201c;&#x3e;20&#x201d; indicate ages under 20 and over 20, respectively.<inline-graphic xlink:href="frai-04-589632-fx1.tif"/>
</p>
</caption>
</table-wrap>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>Example networks from Normal-Male and Normal-Female groups.</p>
</caption>
<graphic xlink:href="frai-04-589632-g003.tif"/>
</fig>
<p>We conduct the two-sample test based on <inline-formula id="inf163">
<mml:math id="m186">
<mml:msubsup>
<mml:mi>T</mml:mi>
<mml:mtext>new</mml:mtext>
<mml:mo>&#x2032;</mml:mo>
</mml:msubsup>
</mml:math>
</inline-formula> for each pair of groups with 10,000 permutations; the results are summarized in <xref ref-type="table" rid="T7">Table&#x20;7</xref>. We see that the new test rejects the null hypothesis of homogeneity at the 5% significance level for pairs of groups that differ in both diagnosis and age (Normal&#x3e;20 vs ADS&#x3c;20 and Normal&#x3c;20 vs ADS&#x3e;20). In addition, the new test rejects the null hypothesis of homogeneity across age within both the normal and the ADS groups (Normal&#x3c;20 vs Normal&#x3e;20 and ADS&#x3c;20 vs ADS&#x3e;20).</p>
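<p>A permutation procedure of this kind can be sketched as follows. The sketch makes several assumptions: the proposed statistic is replaced by a simple stand-in (the Frobenius norm of the difference between the mean weight matrices), all function names are hypothetical, and the graphs are synthetic. It illustrates only the mechanics of a permutation p-value for two groups of unequal size.</p>
<preformat>
```python
import numpy as np


def mean_difference_statistic(sample_a, sample_b):
    """Stand-in statistic: Frobenius norm of the difference between the
    mean weight matrices of the two samples (not the paper's statistic)."""
    return float(np.linalg.norm(sample_a.mean(axis=0) - sample_b.mean(axis=0)))


def permutation_p_value(sample_a, sample_b, statistic, n_permutations=10_000, seed=0):
    """Permutation p-value; the two samples may have different sizes."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([sample_a, sample_b], axis=0)
    n_a = len(sample_a)
    observed = statistic(sample_a, sample_b)
    exceed = 0
    for _ in range(n_permutations):
        # Reassign group labels at random while keeping the group sizes.
        order = rng.permutation(len(pooled))
        permuted = statistic(pooled[order[:n_a]], pooled[order[n_a:]])
        if permuted >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_permutations + 1)


def random_weighted_graphs(n_graphs, n_nodes, shift, rng):
    """Synthetic symmetric weight matrices with zero diagonal."""
    noise = rng.normal(loc=shift, scale=1.0, size=(n_graphs, n_nodes, n_nodes))
    weights = (noise + noise.transpose(0, 2, 1)) / 2.0
    diag = np.arange(n_nodes)
    weights[:, diag, diag] = 0.0
    return weights


rng = np.random.default_rng(1)
group_a = random_weighted_graphs(30, 10, shift=0.0, rng=rng)
group_b = random_weighted_graphs(20, 10, shift=1.0, rng=rng)  # unbalanced on purpose
p_value = permutation_p_value(group_a, group_b, mean_difference_statistic,
                              n_permutations=500)
print(p_value)
```
</preformat>
<p>Because the labels are permuted over the pooled sample, the procedure does not require the two groups to have the same size. In practice, the stand-in statistic would be replaced by the statistic under study.</p>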
<table-wrap id="T7" position="float">
<label>TABLE 7</label>
<caption>
<p>
<italic>p</italic>-values of the tests on the ABIDE dataset.<inline-graphic xlink:href="frai-04-589632-fx2.tif"/>
</p>
</caption>
</table-wrap>
<p>This result indicates that there is a dataset shift with respect to age even within the normal group and within the ADS group. Hence, the fact that the normal and ADS groups differ by age may affect machine learning tasks, such as subject classification and prediction, over the population. Moreover, for a dataset in which the normal and ADS groups differ by age but not by gender, a machine learning classification or prediction model may not be reliable. Detecting dataset shift therefore sheds light on how to obtain more reliable results in the machine learning task.</p>
<p>We also compare the new test with the existing method <inline-formula id="inf164">
<mml:math id="m187">
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mtext>fro</mml:mtext>
</mml:msub>
</mml:math>
</inline-formula> on this example. Note that the existing method <inline-formula id="inf165">
<mml:math id="m188">
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mtext>asymp</mml:mtext>
</mml:msub>
</mml:math>
</inline-formula> may not be reliable here because of the small number of nodes. Since <inline-formula id="inf166">
<mml:math id="m189">
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mtext>fro</mml:mtext>
</mml:msub>
</mml:math>
</inline-formula> is applicable only to balanced sample sizes, we randomly choose 54 graphs from each group, as the smallest sample size among the groups is 54. We run the tests 100&#x20;times at the 5% significance level. The estimated powers are shown in <xref ref-type="table" rid="T8">Table&#x20;8</xref>. We see that the new test in general outperforms <inline-formula id="inf167">
<mml:math id="m190">
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mtext>fro</mml:mtext>
</mml:msub>
</mml:math>
</inline-formula>. For some comparisons, the results differ from those in <xref ref-type="table" rid="T7">Table&#x20;7</xref>. This is because we consider only a subset of the graphs here, since the existing approaches cannot be applied to unbalanced sample sizes.</p>
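<p>The balanced subsampling and power estimation described above can be sketched as follows. The sketch rests on assumptions: the function names are hypothetical, the data are synthetic, and the two-sample test is a simple stand-in based on the mean edge weight per graph. It is not either of the statistics compared in this section; it shows only how equally sized subsamples are drawn repeatedly and how the rejection rate is computed.</p>
<preformat>
```python
import numpy as np


def stand_in_test(sample_a, sample_b, n_permutations, rng):
    """Stand-in balanced two-sample test on the mean edge weight per graph.

    This is a placeholder for a test that needs equal sample sizes; it is
    not a statistic used in the paper.
    """
    values = np.concatenate([sample_a.mean(axis=(1, 2)), sample_b.mean(axis=(1, 2))])
    n_a = len(sample_a)
    observed = abs(values[:n_a].mean() - values[n_a:].mean())
    exceed = 0
    for _ in range(n_permutations):
        shuffled = rng.permutation(values)
        if abs(shuffled[:n_a].mean() - shuffled[n_a:].mean()) >= observed:
            exceed += 1
    return (exceed + 1) / (n_permutations + 1)


def estimate_power(group_a, group_b, subsample_size=54, n_repeats=100,
                   alpha=0.05, n_permutations=200, seed=0):
    """Fraction of repetitions in which the test rejects at level alpha."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_repeats):
        # Draw equally sized subsamples without replacement from each group.
        idx_a = rng.choice(len(group_a), size=subsample_size, replace=False)
        idx_b = rng.choice(len(group_b), size=subsample_size, replace=False)
        p_value = stand_in_test(group_a[idx_a], group_b[idx_b], n_permutations, rng)
        if alpha >= p_value:
            rejections += 1
    return rejections / n_repeats


# Synthetic, unbalanced groups of 12-node weighted graphs (illustrative only).
rng = np.random.default_rng(2)
group_a = rng.normal(loc=0.0, size=(80, 12, 12))
group_b = rng.normal(loc=0.5, size=(60, 12, 12))
group_a = (group_a + group_a.transpose(0, 2, 1)) / 2.0  # symmetrize: undirected
group_b = (group_b + group_b.transpose(0, 2, 1)) / 2.0
power = estimate_power(group_a, group_b)
print(power)
```
</preformat>
<p>Since each repetition uses only part of the available graphs, the estimated rejection rate need not agree with a single test run on the full, unbalanced groups.</p>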
<table-wrap id="T8" position="float">
<label>TABLE 8</label>
<caption>
<p>Estimated power of the tests at the 5% significance level. Black numbers indicate the power of the test based on <inline-formula id="inf168">
<mml:math id="m191">
<mml:msub>
<mml:mi>T</mml:mi>
<mml:mtext>fro</mml:mtext>
</mml:msub>
</mml:math>
</inline-formula>, and red numbers indicate the power of the test based on <inline-formula id="inf169">
<mml:math id="m192">
<mml:msubsup>
<mml:mi>T</mml:mi>
<mml:mtext>new</mml:mtext>
<mml:mo>&#x2032;</mml:mo>
</mml:msubsup>
</mml:math>
</inline-formula>.<inline-graphic xlink:href="frai-04-589632-fx3.tif"/>
</p>
</caption>
</table-wrap>
</sec>
</sec>
</sec>
<sec id="s5">
<title>5 Conclusion</title>
<p>We propose a new two-sample test statistic for graph-structured data. Unlike existing methods, the new test statistic is versatile: it is applicable to directed graphs, imbalanced sample sizes, and even weighted graphs. The asymptotic distribution of the test statistic is presented, and a practical testing procedure is proposed. The performance of the new method is studied under a number of settings. Experiments demonstrate that the new test in general outperforms state-of-the-art tests. The proposed test is also applied to two real datasets (including a safety-critical healthcare application), and we show that the new approach is effective at detecting heterogeneity between disparate samples.</p>
</sec>
</body>
<back>
<sec id="s6">
<title>Data Availability Statement</title>
<p>The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="s7">
<title>Author Contributions</title>
<p>HS developed the main method and proposed the testing procedure based on the new test statistic. He conducted the simulation experiments and the real data analysis. JJ and BK provided the intuition and the direction of the method and worked on the simulation experiments with HS. JJ provided the real dataset, and JJ and BK discussed the results with HS. HS, JJ, and BK wrote the paper together.</p>
</sec>
<sec id="s9">
<title>Funding</title>
<p>This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. This work was supported by the DOE Advanced Scientific Computing Research. Release number LLNL-JRNL-822138.</p>
</sec>
<sec sec-type="COI-statement" id="s10">
<title>Disclaimer</title>
<p>The views and opinions of the authors do not necessarily reflect those of the U.S. government or Lawrence Livermore National Security, LLC, neither of whom nor any of their employees make any endorsements, express or implied warranties or representations, or assume any legal liability or responsibility for the accuracy, completeness, or usefulness of the information contained herein.</p>
</sec>
<sec sec-type="COI-statement" id="s11">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bollob&#xe1;s</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Janson</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Riordan</surname>
<given-names>O.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>The Phase Transition in Inhomogeneous Random Graphs</article-title>. <source>Random Struct. Alg.</source> <volume>31</volume>, <fpage>3</fpage>&#x2013;<lpage>122</lpage>. <pub-id pub-id-type="doi">10.1002/rsa.20168</pub-id> </citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bubeck</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Ding</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Eldan</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>R&#xe1;cz</surname>
<given-names>M. Z.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Testing for High-Dimensional Geometry in Random Graphs</article-title>. <source>Random Struct. Alg.</source> <volume>49</volume>, <fpage>503</fpage>&#x2013;<lpage>532</lpage>. <pub-id pub-id-type="doi">10.1002/rsa.20633</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="web">
<person-group person-group-type="author">
<name>
<surname>Bulusu</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Kailkhura</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Varshney</surname>
<given-names>P. K.</given-names>
</name>
<name>
<surname>Song</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Anomalous Instance Detection in Deep Learning: A Survey</article-title>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/2003.06979">arXiv:2003.06979</ext-link>
</comment> (<comment>Accessed March 16, 2020</comment>). </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Ward</surname>
<given-names>B. D.</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Jones</surname>
<given-names>J.&#x20;L.</given-names>
</name>
<etal/>
</person-group> (<year>2011</year>). <article-title>Classification of Alzheimer Disease, Mild Cognitive Impairment, and Normal Cognitive Status with Large-Scale Network Analysis Based on Resting-State Functional MR Imaging</article-title>. <source>Radiology</source> <volume>259</volume>, <fpage>213</fpage>&#x2013;<lpage>221</lpage>. <pub-id pub-id-type="doi">10.1148/radiol.10100734</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Di Martino</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Yan</surname>
<given-names>C.-G.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Denio</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Castellanos</surname>
<given-names>F. X.</given-names>
</name>
<name>
<surname>Alaerts</surname>
<given-names>K.</given-names>
</name>
<etal/>
</person-group> (<year>2014</year>). <article-title>The Autism Brain Imaging Data Exchange: toward a Large-Scale Evaluation of the Intrinsic Brain Architecture in Autism</article-title>. <source>Mol. Psychiatry</source> <volume>19</volume>, <fpage>659</fpage>&#x2013;<lpage>667</lpage>. <pub-id pub-id-type="doi">10.1038/mp.2013.78</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Eagle</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Pentland</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Lazer</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Inferring Friendship Network Structure by Using Mobile Phone Data</article-title>. <source>Proc. Natl. Acad. Sci.</source> <volume>106</volume>, <fpage>15274</fpage>&#x2013;<lpage>15278</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0900282106</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="web">
<person-group person-group-type="author">
<name>
<surname>Gao</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Lafferty</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Testing Network Structure Using Relations between Small Subgraph Probabilities</article-title>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/1704.06742">arXiv:1704.06742</ext-link>
</comment> (<comment>Accessed April 22, 2017</comment>). </citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ghoshdastidar</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>von Luxburg</surname>
<given-names>U.</given-names>
</name>
</person-group> (<year>2018</year>). &#x201c;<article-title>Practical Methods for Graph Two-Sample Testing</article-title>,&#x201d; in <conf-name>Advances in Neural Information Processing Systems</conf-name>, <conf-date>December, 2018</conf-date>, <fpage>3019</fpage>&#x2013;<lpage>3028</lpage>. </citation>
</ref>
<ref id="B9">
<citation citation-type="web">
<person-group person-group-type="author">
<name>
<surname>Ghoshdastidar</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Gutzeit</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Carpentier</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>von Luxburg</surname>
<given-names>U.</given-names>
</name>
</person-group> (<year>2017a</year>). <article-title>Two-sample Hypothesis Testing for Inhomogeneous Random Graphs</article-title>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/1707.00833">arXiv:1707.00833</ext-link>
</comment> (<comment>Accessed July 4, 2017</comment>). </citation>
</ref>
<ref id="B10">
<citation citation-type="web">
<person-group person-group-type="author">
<name>
<surname>Ghoshdastidar</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Gutzeit</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Carpentier</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>von Luxburg</surname>
<given-names>U.</given-names>
</name>
</person-group> (<year>2017b</year>). <article-title>Two-sample Tests for Large Random Graphs Using Network Statistics</article-title>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/1705.06168v2">arXiv:1705.06168v2</ext-link>
</comment> (<comment>Accessed May 26, 2017</comment>). </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ginestet</surname>
<given-names>C. E.</given-names>
</name>
<name>
<surname>Fournel</surname>
<given-names>A. P.</given-names>
</name>
<name>
<surname>Simmons</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Statistical Network Analysis for Functional MRI: Summary Networks and Group Comparisons</article-title>. <source>Front. Comput. Neurosci.</source> <volume>8</volume>, <fpage>51</fpage>. <pub-id pub-id-type="doi">10.3389/fncom.2014.00051</pub-id> </citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ginestet</surname>
<given-names>C. E.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Balachandran</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Rosenberg</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Kolaczyk</surname>
<given-names>E. D.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Hypothesis Testing for Network Data in Functional Neuroimaging</article-title>. <source>Ann. Appl. Stat.</source> <volume>11</volume>, <fpage>725</fpage>&#x2013;<lpage>750</lpage>. <pub-id pub-id-type="doi">10.1214/16-aoas1015</pub-id> </citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ginestet</surname>
<given-names>C. E.</given-names>
</name>
<name>
<surname>Nichols</surname>
<given-names>T. E.</given-names>
</name>
<name>
<surname>Bullmore</surname>
<given-names>E. T.</given-names>
</name>
<name>
<surname>Simmons</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Brain Network Analysis: Separating Cost from Topology Using Cost-Integration</article-title>. <source>PLoS ONE</source> <volume>6</volume>, <fpage>e21570</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0021570</pub-id> </citation>
</ref>
<ref id="B14">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Hoeffding</surname>
<given-names>W.</given-names>
</name>
</person-group> (<year>1992</year>). &#x201c;<article-title>A Class of Statistics with Asymptotically Normal Distribution</article-title>,&#x201d; in <source>Breakthroughs in Statistics (Springer)</source>, <fpage>308</fpage>&#x2013;<lpage>334</lpage>. <pub-id pub-id-type="doi">10.1007/978-1-4612-0919-5_20</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Holland</surname>
<given-names>P. W.</given-names>
</name>
<name>
<surname>Laskey</surname>
<given-names>K. B.</given-names>
</name>
<name>
<surname>Leinhardt</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>1983</year>). <article-title>Stochastic Blockmodels: First Steps</article-title>. <source>Social Networks</source> <volume>5</volume>, <fpage>109</fpage>&#x2013;<lpage>137</lpage>. <pub-id pub-id-type="doi">10.1016/0378-8733(83)90021-7</pub-id> </citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kelly</surname>
<given-names>A. M. C.</given-names>
</name>
<name>
<surname>Uddin</surname>
<given-names>L. Q.</given-names>
</name>
<name>
<surname>Biswal</surname>
<given-names>B. B.</given-names>
</name>
<name>
<surname>Castellanos</surname>
<given-names>F. X.</given-names>
</name>
<name>
<surname>Milham</surname>
<given-names>M. P.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>Competition between Functional Brain Networks Mediates Behavioral Variability</article-title>. <source>Neuroimage</source> <volume>39</volume>, <fpage>527</fpage>&#x2013;<lpage>537</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2007.08.008</pub-id> </citation>
</ref>
<ref id="B17">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Lehmann</surname>
<given-names>E. L.</given-names>
</name>
<name>
<surname>Romano</surname>
<given-names>J.&#x20;P.</given-names>
</name>
</person-group> (<year>2006</year>). <source>Testing Statistical Hypotheses</source>. <publisher-loc>Berlin, Germany</publisher-loc>: <publisher-name>Springer Science &#x26; Business Media</publisher-name>.</citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lei</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>A Goodness-Of-Fit Test for Stochastic Block Models</article-title>. <source>Ann. Stat.</source> <volume>44</volume>, <fpage>401</fpage>&#x2013;<lpage>424</lpage>. <pub-id pub-id-type="doi">10.1214/15-aos1370</pub-id> </citation>
</ref>
<ref id="B19">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Macindoe</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Richards</surname>
<given-names>W.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Graph Comparison Using Fine Structure Analysis</article-title>, <conf-name>IEEE Second International Conference on Social Computing</conf-name>. <publisher-name>IEEE</publisher-name>. <pub-id pub-id-type="doi">10.1109/socialcom.2010.35</pub-id> </citation>
</ref>
<ref id="B20">
<citation citation-type="web">
<person-group person-group-type="author">
<name>
<surname>Maugis</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Priebe</surname>
<given-names>C. E.</given-names>
</name>
<name>
<surname>Olhede</surname>
<given-names>S. C.</given-names>
</name>
<name>
<surname>Wolfe</surname>
<given-names>P. J.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Statistical Inference for Network Samples Using Subgraph Counts</article-title>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/1701.00505">arXiv:1701.00505</ext-link>
</comment> (<comment>Accessed January 2, 2017</comment>). </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Newman</surname>
<given-names>M. E.</given-names>
</name>
<name>
<surname>Girvan</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>Finding and Evaluating Community Structure in Networks</article-title>. <source>Phys. Rev. E</source> <volume>69</volume>, <fpage>026113</fpage>. <pub-id pub-id-type="doi">10.1103/physreve.69.026113</pub-id> </citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Newman</surname>
<given-names>M. E. J.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>Modularity and Community Structure in Networks</article-title>. <source>Proc. Natl. Acad. Sci.</source> <volume>103</volume>, <fpage>8577</fpage>&#x2013;<lpage>8582</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0601602103</pub-id> </citation>
</ref>
<ref id="B23">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Parisot</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Ktena</surname>
<given-names>S. I.</given-names>
</name>
<name>
<surname>Ferrante</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Moreno</surname>
<given-names>R. G.</given-names>
</name>
<name>
<surname>Glocker</surname>
<given-names>B.</given-names>
</name>
<etal/>
</person-group> (<year>2017</year>). &#x201c;<article-title>Spectral Graph Convolutions for Population-Based Disease Prediction</article-title>,&#x201d; in <conf-name>International Conference On Medical Image Computing and Computer-Assisted Intervention</conf-name>, <conf-loc>QC, Canada</conf-loc>, <conf-date>September 10&#x2013;14, 2017</conf-date> (<publisher-name>Springer</publisher-name>), <fpage>177</fpage>&#x2013;<lpage>185</lpage>. </citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Plitt</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Barnes</surname>
<given-names>K. A.</given-names>
</name>
<name>
<surname>Martin</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Functional Connectivity Classification of Autism Identifies Highly Predictive Brain Features but Falls Short of Biomarker Standards</article-title>. <source>NeuroImage: Clin.</source> <volume>7</volume>, <fpage>359</fpage>&#x2013;<lpage>366</lpage>. <pub-id pub-id-type="doi">10.1016/j.nicl.2014.12.013</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rabanser</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>G&#xfc;nnemann</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Lipton</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2019</year>). &#x201c;<article-title>Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift</article-title>,&#x201d; in <conf-name>33rd Conference on Neural Information Processing Systems</conf-name>, <conf-loc>Vancouver, Canada</conf-loc>, <fpage>1396</fpage>&#x2013;<lpage>1408</lpage>. </citation>
</ref>
<ref id="B26">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Serfling</surname>
<given-names>R. J.</given-names>
</name>
</person-group> (<year>2009</year>). <source>
<italic>Approximation Theorems of Mathematical Statistics</italic>
</source>. <publisher-loc>Hoboken, NJ</publisher-loc>: <publisher-name>John Wiley &#x26; Sons</publisher-name>.</citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shervashidze</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Vishwanathan</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Petri</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Mehlhorn</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Borgwardt</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2009</year>). &#x201c;<article-title>Efficient Graphlet Kernels for Large Graph Comparison</article-title>,&#x201d; in <conf-name>Artificial Intelligence and Statistics</conf-name>, <fpage>488</fpage>&#x2013;<lpage>495</lpage>. </citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tang</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Athreya</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Sussman</surname>
<given-names>D. L.</given-names>
</name>
<name>
<surname>Lyzinski</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Priebe</surname>
<given-names>C. E.</given-names>
</name>
</person-group> (<year>2017a</year>). <article-title>A Semiparametric Two-Sample Hypothesis Testing Problem for Random Graphs</article-title>. <source>J.&#x20;Comput. Graphical Stat.</source> <volume>26</volume>, <fpage>344</fpage>&#x2013;<lpage>354</lpage>. <pub-id pub-id-type="doi">10.1080/10618600.2016.1193505</pub-id> </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tang</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Athreya</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Sussman</surname>
<given-names>D. L.</given-names>
</name>
<name>
<surname>Lyzinski</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Priebe</surname>
<given-names>C. E.</given-names>
</name>
</person-group> (<year>2017b</year>). <article-title>A Nonparametric Two-Sample Hypothesis Testing Problem for Random Graphs</article-title>. <source>Bernoulli</source> <volume>23</volume>, <fpage>1599</fpage>&#x2013;<lpage>1630</lpage>. <pub-id pub-id-type="doi">10.3150/15-bej789</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>
