<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Future Transp.</journal-id>
<journal-title>Frontiers in Future Transportation</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Future Transp.</abbrev-journal-title>
<issn pub-type="epub">2673-5210</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">644988</article-id>
<article-id pub-id-type="doi">10.3389/ffutr.2021.644988</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Future Transportation</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Estimation of Traffic Flow Rate With Data From Connected-Automated Vehicles Using Bayesian Inference and Deep Learning</article-title>
<alt-title alt-title-type="left-running-head">Han and Ahn</alt-title>
<alt-title alt-title-type="right-running-head">Flow Rate Estimation with CAV</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Han</surname>
<given-names>Youngjun</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1170631/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ahn</surname>
<given-names>Soyoung</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/950655/overview"/>
</contrib>
</contrib-group>
<aff id="aff1">
<label>
<sup>1</sup>
</label>Department of Transportation System Research, The Seoul Institute, <addr-line>Seoul</addr-line>, <country>South Korea</country>
</aff>
<aff id="aff2">
<label>
<sup>2</sup>
</label>Department of Civil and Environmental Engineering, University of Wisconsin-Madison, <addr-line>Madison</addr-line>, <addr-line>WI</addr-line>, <country>United&#x20;States</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/932059/overview">Monica Menendez</ext-link>, New York University Abu Dhabi, United Arab Emirates</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/984311/overview">Kaidi Yang</ext-link>, Stanford University, United&#x20;States</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/970878/overview">Saif Eddin Ghazi Jabari</ext-link>, New York University Abu Dhabi, United Arab Emirates</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Youngjun Han, <email>yjhan@si.re.kr</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Transportation Systems Modeling, a section of the journal Frontiers in Future Transportation</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>18</day>
<month>03</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>2</volume>
<elocation-id>644988</elocation-id>
<history>
<date date-type="received">
<day>22</day>
<month>12</month>
<year>2020</year>
</date>
<date date-type="accepted">
<day>08</day>
<month>02</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2021 Han and Ahn.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Han and Ahn</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these&#x20;terms.</p>
</license>
</permissions>
<abstract>
<p>Connected automated vehicles (CAVs) hold promise to replace current traffic detection systems in the near future. However, traffic state estimation, particularly of flow rate, poses a major challenge at low CAV penetration rates without supporting sensor infrastructure. This paper proposes flow rate estimation methods using headway data from CAVs. Specifically, Bayesian inference and deep learning based methods are developed and compared with a na&#xef;ve method based on a simple arithmetic mean of observed headways. The proposed methods are investigated via numerical experiments to evaluate their performance with respect to the CAV penetration rate, traffic demand, and availability of historical data. The methods are further validated with real data. The results show that the Bayesian inference based method, which estimates the flow rate distribution by integrating current (real-time) data and previous knowledge, can perform well even at low penetration rates with good prior information. However, at high CAV penetration, its relative advantage over the other methods diminishes because the prior information always influences the flow rate estimation. The deep learning based method can be effective with a large amount of data to train the model; however, at low CAV penetration, it tends to converge to the mean of target output values regardless of the observed data. Finally, at relatively high CAV penetration, the relative advantage of the advanced methods is negligible and, in fact, the na&#xef;ve method is preferred in terms of accuracy as well as efficiency.</p>
</abstract>
<kwd-group>
<kwd>connected automated vehicle</kwd>
<kwd>traffic flow rate estimation</kwd>
<kwd>deep learning</kwd>
<kwd>Bayesian inference</kwd>
<kwd>NGSIM data</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<title>Introduction</title>
<p>Traffic data collected by various detector systems is fundamental to traffic operations. Conventional detectors, such as inductive loop detectors, typically provide vehicle speed, flow rate, and occupancy at fixed locations, and traffic states can be estimated using these data. On the other hand, connected-automated vehicles (CAVs) are expected to be on our roads in the near future and fundamentally change how we sense and control traffic. CAVs can collect detailed and accurate data about themselves and the surrounding vehicles through advanced sensing, and they can share these high-resolution data in real time through V2V (vehicle-to-vehicle) or V2I (vehicle-to-infrastructure) communication. Since CAVs can collect and provide traffic data, they can replace the current infrastructure-based detector systems which are costly to install and maintain. Recognizing this potential, a number of advanced concepts of traffic control using CAVs have emerged in recent years (<xref ref-type="bibr" rid="B15">Hegyi et&#x20;al., 2013</xref>; <xref ref-type="bibr" rid="B36">Roncoli et&#x20;al., 2015</xref>; <xref ref-type="bibr" rid="B14">Han et&#x20;al., 2017</xref>; <xref ref-type="bibr" rid="B13">Han and Ahn, 2018</xref>).</p>
<p>In the early stages of CAV adoption, traffic data may be obtained from both traditional detectors and CAVs. However, given the high cost of detector maintenance, agencies may wish to phase out traditional detectors quickly if CAV data alone can provide sufficient information. Furthermore, in many areas, detector coverage is insufficient to estimate traffic states with reasonable accuracy. Thus, deriving traffic information mainly from CAV data could reduce reliance on traditional sensors and extend the data collection coverage. The initial low penetration rate of CAVs, however, is a significant obstacle to obtaining reliable traffic information. To overcome this, various methods to estimate traffic states using limited data from CAVs, connected vehicles (CVs), or probe vehicles have been developed in the literature (<xref ref-type="bibr" rid="B38">Seo et&#x20;al., 2017</xref>). For example, <xref ref-type="bibr" rid="B3">Bekiaris-Liberis et&#x20;al. (2016)</xref> presented a macroscopic model-based approach to estimate density and flow rates in mixed traffic of conventional and connected vehicles. They used the average speed of CVs, assuming that it is similar to the average speed of the entire traffic flow, together with a total flow rate from conventional detectors. The proposed method was validated via microscopic simulation considering a low penetration rate of CVs (<xref ref-type="bibr" rid="B11">Fountoulakis et&#x20;al., 2017</xref>). Later, <xref ref-type="bibr" rid="B2">Bekiaris-Liberis et&#x20;al. (2017)</xref> also developed a traffic state (per-lane density, on-ramp and off-ramp flows) estimation method using CV data with total flow from fixed detectors. This method was evaluated in microscopic simulation with NGSIM data (<xref ref-type="bibr" rid="B30">Papadopoulou et&#x20;al., 2018</xref>). While these previous studies demonstrate satisfactory estimation of traffic states at low CV penetration, they still require conventional detectors, particularly for flow rate, albeit fewer than the current detection system requires.</p>
<p>On the other hand, <xref ref-type="bibr" rid="B39">Seo et&#x20;al. (2015)</xref> developed a flow and density estimation method based on Edie&#x2019;s generalized definitions (<xref ref-type="bibr" rid="B6">Edie, 1963</xref>) using only data from probe vehicles that can measure the spacing to their leading vehicles. They performed a field experiment with 20 probe vehicles and verified that the proposed method could effectively capture important traffic dynamics such as queue propagation, even at a very low penetration rate of probe vehicles. Similarly, <xref ref-type="bibr" rid="B40">Seo and Kusakabe (2015)</xref> developed a method to estimate traffic states from probe vehicle data using the flow conservation law. They estimated the number of vehicles between two neighboring probe vehicles based on their average headways (over distance) with their respective leaders (non-probe vehicles) and the average time (over distance) interval between the probe vehicles. These methods clearly demonstrate the possibility of using CAV-only data to estimate traffic states, and the simple conservation law enhances the accuracy without any exogenous assumptions such as a fundamental diagram. However, they assumed that the relationship between a probe vehicle and its leading vehicle represents the traffic state at large, and therefore, significant error is expected when the headway deviation among vehicles is large, particularly in free-flow traffic. Thus, reliable estimation of traffic states, particularly flow rate, using only CAV data remains a major challenge.</p>
<p>The methods introduced above are grounded in sound traffic flow theory. Nevertheless, they show limitations in their performance or applications, largely due to their limited ability to capture complex features in the traffic data. On the other hand, state-of-the-art data-driven methods have emerged to address feature complexity and to overcome data scarcity. Among them, Bayesian inference is a well-established statistical method for drawing inferences, particularly when data are limited. It estimates the distribution of an unknown quantity conditional on the observed data while integrating prior knowledge. In traffic engineering, Bayesian methods are widely used to estimate capacity (<xref ref-type="bibr" rid="B29">Ozguven and Ozbay, 2008</xref>), travel time (<xref ref-type="bibr" rid="B17">Jintanakul et&#x20;al., 2009</xref>; <xref ref-type="bibr" rid="B10">Fei et&#x20;al., 2011</xref>; <xref ref-type="bibr" rid="B16">Hofleitner et&#x20;al., 2012</xref>), or traffic state (<xref ref-type="bibr" rid="B27">Neumann et&#x20;al., 2013</xref>; <xref ref-type="bibr" rid="B21">Kim and Wang, 2016</xref>). Since traffic exhibits recurrent daily patterns, past traffic information can complement limited real-time data from CAVs. Thus, Bayesian inference is a good candidate method to estimate traffic states in a CAV environment. Nonetheless, research in this regard is largely missing from the current literature.</p>
<p>Another promising class of data-driven methods is machine learning, such as deep learning. Despite its inability to provide physical insight, a notable advantage of deep learning is that it can capture complex features of data to describe a target value even if the relationship is nonlinear and too complex to describe by conventional methods. In traffic engineering, deep learning is widely used in many areas such as vehicle behavior modeling (<xref ref-type="bibr" rid="B43">Wei et&#x20;al., 2010</xref>; <xref ref-type="bibr" rid="B20">Khodayari et&#x20;al., 2012</xref>; <xref ref-type="bibr" rid="B25">Mathew and Ravishankar, 2012</xref>; <xref ref-type="bibr" rid="B44">Zheng et&#x20;al., 2013</xref>; <xref ref-type="bibr" rid="B31">Papathanasopoulou and Antoniou, 2015</xref>; <xref ref-type="bibr" rid="B42">Simonelli et&#x20;al., 2015</xref>; <xref ref-type="bibr" rid="B22">Lefevre et&#x20;al., 2016</xref>; <xref ref-type="bibr" rid="B26">Motamedidehkordi et&#x20;al., 2017</xref>; <xref ref-type="bibr" rid="B45">Zhou et&#x20;al., 2017</xref>) and future traffic state predictions (<xref ref-type="bibr" rid="B24">Ma et&#x20;al., 2015</xref>; <xref ref-type="bibr" rid="B12">Fusco et&#x20;al., 2016</xref>; <xref ref-type="bibr" rid="B18">Julio et&#x20;al., 2016</xref>; <xref ref-type="bibr" rid="B35">Polson and Sokolov, 2017</xref>). For example, <xref ref-type="bibr" rid="B35">Polson and Sokolov (2017)</xref> developed a deep learning architecture for short-term flow prediction. The proposed model was validated with loop-detector data in the Chicago area and showed reliable prediction performance in capturing nonlinear changes of flow rate. Clearly, deep learning has great potential to link (sparse) CAV data to traffic states at large, but this potential has not been fully explored, including for the estimation and prediction of flow&#x20;rate.</p>
<p>Based on the above review, we find that advanced data-driven methods have the potential to provide better estimation and prediction capabilities. However, a systematic investigation into their advantages and limitations for traffic flow estimation is currently lacking. To this end, this paper aims to address 1) whether promising data-driven methods can be used to estimate traffic states, more specifically flow rates in free-flow traffic, using sparse CAV data; 2) how these methods perform in different traffic conditions (e.g., demand, CAV penetration rate); and 3) how much better these methods can perform compared to the simple average approach. Specifically, we consider three methods: 1) a na&#xef;ve method that relies only on the observed CAV data as a baseline, 2) a Bayesian inference based method that integrates real-time CAV data and historical traffic data, and 3) a deep learning based method that extracts complex relations between CAV headways and traffic state directly from a large amount of data. This paper evaluates the three methods through numerical experiments and validates them with real data. The evaluation results show how each method fares against the others in different traffic situations (e.g., different flow rates, CAV penetration rates, etc.), shedding light on the situations in which each method should be preferred.</p>
<p>Note that we focus on estimating the flow rate in free-flow states because it is an important indicator for predicting traffic breakdown (<xref ref-type="bibr" rid="B7">Elefteriadou et&#x20;al., 1995</xref>; <xref ref-type="bibr" rid="B33">Persaud et&#x20;al., 1998</xref>; <xref ref-type="bibr" rid="B13">Han and Ahn, 2018</xref>). A major challenge is that in a free flow state, vehicle headways (both conventional vehicles and CAVs) are distributed randomly due to the randomness in vehicle arrivals (i.e.,&#x20;dictated by the demand). Therefore, partial CAV headway data may not represent the flow rate of traffic at large. On the other hand, in a congested state, vehicle headways show less variation as vehicles are constrained, and random arrivals are much less likely. Thus, we expect partial CAV headways to represent the flow rate better in congested traffic. In addition, speed estimation from CAV data is more straightforward as the partial CAV speed is similar to the traffic speed (<xref ref-type="bibr" rid="B8">Elfar et&#x20;al., 2018</xref>). However, speed does not vary significantly in free-flow traffic and thus, is not a good indicator for predicting traffic breakdown.</p>
<p>The main findings of this paper are as follows. The proposed Bayesian inference based method can show good performance even at a low CAV penetration rate (&#x3c;20%) due to its reliance on prior (historical) information. However, as the CAV penetration or demand increases, its relative advantage over the other methods (a deep learning based method and even a simple average) wanes since the prior information will always influence the flow rate estimation. In particular, at high CAV penetration, where real-time CAV information alone suffices for accurate flow estimation, inclusion of prior information can actually hinder the accuracy. The narrower the prior distribution, the stronger the influence of the prior information on the flow estimate. In contrast, the deep learning based method is effective for estimating the flow rate using only CAV data when the CAV penetration rate is moderate to high (&#x3e;20%). However, when the data are sparse (in light traffic or at low penetration), the method produces an estimate close to the mean of the training data regardless of the observed real-time data. Finally, at a relatively high CAV penetration rate (&#x3e;70%), the relative advantage of the advanced methods is negligible, and in fact, the na&#xef;ve method is preferred in terms of accuracy as well as efficiency.</p>
<p>This paper consists of five sections. <italic>Methods</italic> describes the proposed methods, and <italic>Numerical Experiment</italic> describes the numerical experiments that investigate the features of each method under various traffic conditions. In <italic>Validation With Real Data</italic>, the methods are validated with real data, and conclusions and discussion are provided in <italic>Conclusion and Discussion</italic>.</p>
</sec>
<sec sec-type="methods" id="s2">
<title>Methods</title>
<p>This section presents methods that estimate a flow rate using CAV data. First, we assume that a CAV shares its own state (e.g., location, speed) with roadside infrastructure and also measures surrounding vehicles (e.g., spacing, relative speed) through its sensors. In this context, we consider that the following data are available over time from CAVs, as illustrated in <xref ref-type="fig" rid="F1">Figure&#x20;1</xref>.<list list-type="bullet">
<list-item>
<p>Location, <inline-formula id="inf1">
<mml:math id="m1">
<mml:mi>l</mml:mi>
</mml:math>
</inline-formula>, and Speed, <inline-formula id="inf2">
<mml:math id="m2">
<mml:mrow>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>v</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>, of the&#x20;CAV.</p>
</list-item>
<list-item>
<p>Spacing between the CAV and its leading vehicle, <inline-formula id="inf3">
<mml:math id="m3">
<mml:mrow>
<mml:mtext>&#xa0;</mml:mtext>
<mml:mi>s</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula>.<xref ref-type="fn" rid="FN1">
<sup>1</sup>
</xref>
</p>
</list-item>
</list>
</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Illustration of available data from CAVs over&#x20;time.</p>
</caption>
<graphic xlink:href="ffutr-02-644988-g001.tif"/>
</fig>
<p>For simplicity, we also assume that the data from CAVs have negligible error. Using these data, we can readily estimate the (time) headway between a CAV and its leading vehicle, <inline-formula id="inf4">
<mml:math id="m4">
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mo>&#xa0;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>/</mml:mo>
<mml:mi>v</mml:mi>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>. Using the headway data collected in a given time interval, <inline-formula id="inf5">
<mml:math id="m5">
<mml:mi>T</mml:mi>
</mml:math>
</inline-formula>, a flow rate will be estimated through the proposed methods in the following subsections<xref ref-type="fn" rid="FN2">
<sup>2</sup>
</xref>. We assume that CAV data can be collected continuously over time and space, and thus, the flow rate can be estimated over the entire time-space domain.</p>
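<p>The headway computation and the grouping by time interval described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the record layout (timestamp, spacing, speed), the function name <monospace>headways_by_interval</monospace>, and the rule of skipping stopped vehicles are assumptions introduced here.</p>
<preformat><![CDATA[
```python
from collections import defaultdict


def headways_by_interval(records, interval_s):
    """Group time headways h = s / v into consecutive intervals of length T.

    records: iterable of (timestamp_s, spacing_m, speed_mps) tuples, one per
    CAV observation (hypothetical layout, not taken from the paper).
    interval_s: length of the estimation interval T, in seconds.
    """
    bins = defaultdict(list)
    for t, s, v in records:
        if v > 0:  # assumption: skip stopped vehicles, where s / v is undefined
            bins[int(t // interval_s)].append(s / v)
    return dict(bins)


# Three illustrative observations, grouped into 60-second intervals.
records = [(5.0, 40.0, 20.0), (12.0, 60.0, 30.0), (65.0, 50.0, 25.0)]
grouped = headways_by_interval(records, 60.0)
```
]]></preformat>
<p>Each key of the returned dictionary indexes one interval, and its list of headways is the input that the estimation methods would consume for that interval.</p>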
<sec id="s2-1">
<title>Method 1: Na&#xef;ve Method (Baseline)</title>
<p>The first method is the simplest: a na&#xef;ve method that relies only on observed CAV data. Other traffic information is assumed unavailable. This method will serve as the baseline to evaluate the performance of the more advanced methods, Methods 2 and 3. In this method, the reciprocal of the arithmetic mean of the observed headways is used to estimate the flow rate, <inline-formula id="inf6">
<mml:math id="m6">
<mml:mrow>
<mml:mo>&#xa0;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>q</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, expressed as:<disp-formula id="e1">
<mml:math id="m7">
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>q</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>E</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>h</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:msubsup>
<mml:mrow>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
<label>(1)</label>
</disp-formula>where <inline-formula id="inf7">
<mml:math id="m8">
<mml:mrow>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the headway between <inline-formula id="inf8">
<mml:math id="m9">
<mml:mrow>
<mml:msup>
<mml:mi>i</mml:mi>
<mml:mrow>
<mml:mtext>th</mml:mtext>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> CAV and its leading vehicle, and <inline-formula id="inf9">
<mml:math id="m10">
<mml:mi>N</mml:mi>
</mml:math>
</inline-formula> is the number of CAVs in the time interval, <inline-formula id="inf10">
<mml:math id="m11">
<mml:mi>T</mml:mi>
</mml:math>
</inline-formula>. Then, the standard error of <inline-formula id="inf11">
<mml:math id="m12">
<mml:mrow>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>h</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf12">
<mml:math id="m13">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c3;</mml:mi>
<mml:mrow>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>h</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, is<disp-formula id="e2">
<mml:math id="m14">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c3;</mml:mi>
<mml:mrow>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>h</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mi>&#x3b4;</mml:mi>
<mml:mrow>
<mml:msqrt>
<mml:mi>N</mml:mi>
</mml:msqrt>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(2)</label>
</disp-formula>where <inline-formula id="inf13">
<mml:math id="m15">
<mml:mi>&#x3b4;</mml:mi>
</mml:math>
</inline-formula> is the standard deviation of headway for all vehicles, including CAVs and conventional vehicles. <xref ref-type="disp-formula" rid="e2">Equation 2</xref> shows that the precision of this method is affected by 1) the traffic state, 2) the penetration rate of CAVs, and 3) the time interval, <inline-formula id="inf14">
<mml:math id="m16">
<mml:mi>T</mml:mi>
</mml:math>
</inline-formula>: the estimated flow rate would not be precise when <inline-formula id="inf15">
<mml:math id="m17">
<mml:mi>&#x3b4;</mml:mi>
</mml:math>
</inline-formula> is large (e.g., in a free flow state) or <inline-formula id="inf16">
<mml:math id="m18">
<mml:mi>N</mml:mi>
</mml:math>
</inline-formula> is small (e.g., a low CAV penetration rate or small&#x20;<inline-formula id="inf17">
<mml:math id="m19">
<mml:mi>T</mml:mi>
</mml:math>
</inline-formula>).</p>
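<p>As a rough numerical illustration of Equations 1 and 2, the sketch below computes the na&#xef;ve flow rate estimate and the standard error of the mean headway. It rests on one assumption that should be read with care: the population standard deviation of headway for all vehicles is not observable from CAVs alone, so the sample standard deviation of the observed CAV headways is used here as a stand-in. The function names are hypothetical.</p>
<preformat><![CDATA[
```python
import math
import statistics


def naive_flow_rate(headways_s):
    """Eq. 1: flow rate (veh/s) as the reciprocal of the mean headway."""
    return 1.0 / statistics.fmean(headways_s)


def headway_standard_error(headways_s):
    """Eq. 2, with delta replaced by the sample standard deviation.

    Assumption: the standard deviation over all vehicles is unknown, so the
    observed CAV headways are used to approximate it.
    """
    return statistics.stdev(headways_s) / math.sqrt(len(headways_s))


# Four illustrative CAV headways (seconds) observed within one interval T.
headways = [1.5, 2.0, 2.5, 2.0]
q_hat_vph = 3600.0 * naive_flow_rate(headways)  # converted to veh/h
se = headway_standard_error(headways)
```
]]></preformat>
<p>Because the standard error scales with the inverse square root of the number of observed CAVs, quadrupling the number of headways in an interval would halve it, which is consistent with the dependence on penetration rate and interval length noted above.</p>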
</sec>
<sec id="s2-2">
<title>Method 2: Bayesian Inference</title>
<p>In many instances, historical traffic data (spanning multiple days to years) are available. Such data can give a sense of the traffic state at a given time and location. Alone, they are not adequate for traffic state estimation due to daily variations, but when combined with real-time data, they can improve the accuracy of traffic state estimation. In statistics, Bayesian inference was developed to systematically integrate (limited) real-time data with other (related) information. In a similar context, we develop a Bayesian inference based method to estimate flow rates using real-time CAV data and the distribution of flow rate from a historical data&#x20;set.</p>
<p>Specifically, this method derives a <italic>posterior</italic> probability distribution of flow rate given the observed headways, <inline-formula id="inf18">
<mml:math id="m20">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>q</mml:mi>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mi>h</mml:mi>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, with a <italic>prior</italic> probability of flow rate, <inline-formula id="inf19">
<mml:math id="m21">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>q</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, and a <italic>likelihood</italic> function of headway given flow rate, <inline-formula id="inf20">
<mml:math id="m22">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, using Bayes&#x2019; theorem:<disp-formula id="e3">
<mml:math id="m23">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>q</mml:mi>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mi>h</mml:mi>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>q</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#xd7;</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>h</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>q</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#xd7;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x220f;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:msubsup>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
<mml:mo>&#xa0;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mrow>
<mml:mo>&#x222b;</mml:mo>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>q</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
<mml:mo>&#xd7;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x220f;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:msubsup>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
<mml:mo>&#xa0;</mml:mo>
<mml:mtext>d</mml:mtext>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(3)</label>
</disp-formula>
</p>
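<p>Note that the second equality treats the observed headways as conditionally independent given the flow rate, and that the denominator is a normalizing factor ensuring that the posterior distribution integrates to one. Thus, for simplicity, <inline-formula id="inf21">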
<mml:math id="m24">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>q</mml:mi>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mi>h</mml:mi>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> can be written as,<disp-formula id="e4">
<mml:math id="m25">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>q</mml:mi>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mi>h</mml:mi>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x221d;</mml:mo>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>q</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#xd7;</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>q</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#xd7;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x220f;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:msubsup>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
<label>(4)</label>
</disp-formula>
</p>
<p>Notably, to estimate <inline-formula id="inf22">
<mml:math id="m26">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>q</mml:mi>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mi>h</mml:mi>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> by Bayesian inference, both <inline-formula id="inf23">
<mml:math id="m27">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>q</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf24">
<mml:math id="m28">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> are required. Firstly, <inline-formula id="inf25">
<mml:math id="m29">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>q</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> represents a prior distribution of flow rate before collecting current headway data. As stated earlier, the flow rate is expected to fluctuate over time but exhibit similar daily patterns (e.g., typical AM or PM rush hour). Thus, historical flow rate data for the same time of the day (and the same day of the week) would be the most reasonable choice for prior information. Note that for the prior data (and the training data for Method 3), obtaining historical data could be the main constraint on adopting these methods. Data from existing detectors can be used if available, but additional surveying would be required if no existing data or detectors are available. Historical estimation results based on previous CAV data can also be used, though there will be a transition period until sufficient data become available. Nevertheless, using CAV data with the proposed methods could reduce the effort needed to collect traffic data and could allow traffic states to be estimated even in areas without any detectors. Secondly, the likelihood function represents the headway distribution for a given flow rate. Field observations can be used to estimate this function, and it has also been widely studied in the literature [see (<xref ref-type="bibr" rid="B23">Li and Chen, 2017</xref>) for a recent review].</p>
<p>These model features suggest that the estimation results will depend on the prior information. Specifically, the estimation results would suffer when the prior provides little information (e.g., a very wide prior distribution), constrains the estimate too much (e.g., a very narrow prior distribution), or differs significantly from the true value (e.g., an actual flow rate far from the bulk of the prior distribution). In Sections <italic>Numerical Experiment</italic> and <italic>Validation With Real Data</italic>, we will verify these features more systematically through numerical experiments and validation with real data, and provide some insight into when the Bayesian inference based model should be expected to perform well or&#x20;poorly.</p>
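<p>As a rough illustration of Eq. 4, the sketch below assumes a gamma prior on the flow rate and an exponential headway likelihood, which are the forms adopted later in the numerical experiment. Under these two assumptions the posterior is again a gamma distribution and can be written in closed form. The function name <italic>gamma_posterior</italic>, the unit conventions, and the ten sample headways are illustrative assumptions. The closed-form update is not necessarily the numerical procedure used in this study, so its values may differ from those reported.</p>
<preformat>
```python
import numpy as np

SECONDS_PER_HOUR = 3600.0


def gamma_posterior(prior_shape, prior_scale, headways_s):
    """Return (shape, scale) of the posterior gamma distribution of q in veh/hr.

    Assumes p(q) is Gamma(prior_shape, prior_scale) in veh/hr and that each
    headway h_i (in seconds) is exponential with rate q / 3600 per second.
    """
    headways_s = np.asarray(headways_s, dtype=float)
    n = headways_s.size
    total_hours = headways_s.sum() / SECONDS_PER_HOUR
    # Likelihood: prod_i q * exp(-q * h_i), proportional to q**n * exp(-q * total_hours).
    post_shape = prior_shape + n
    post_rate = 1.0 / prior_scale + total_hours
    return post_shape, 1.0 / post_rate


# Hypothetical example: prior with mean 2,000 and standard deviation 500 veh/hr,
# updated with ten made-up CAV headways (seconds).
shape, scale = gamma_posterior(
    16.0, 125.0, [1.2, 0.8, 2.5, 1.0, 0.6, 3.1, 1.7, 0.9, 2.2, 1.4]
)
posterior_mean = shape * scale            # mean of a gamma distribution
posterior_mode = (shape - 1.0) * scale    # mode, valid because shape exceeds 1
```
</preformat>
<p>In this sketch the posterior mean falls between the prior mean and the reciprocal of the mean observed headway, which is the qualitative behavior described above: a narrower prior (smaller scale for the same mean) pulls the estimate more strongly toward the historical value.</p>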
</sec>
<sec id="s2-3">
<title>Method 3: Deep-Learning Based Method</title>
<p>With the advancement of data processing techniques, data-driven methods such as deep learning have been widely developed. Unlike the Bayesian approach, which requires both fundamental knowledge of traffic flow (for the likelihood function) and existing data (for the prior distribution), deep learning aims to extract outcomes (e.g., traffic flow) directly from data without relying on a physical model. Deep learning has been applied in a wide variety of disciplines due to its high accuracy when it is trained on a large amount of data, though it does not provide physical insights. Therefore, in this study, we propose a deep learning based method to estimate the flow rate directly from CAV data. Note that, in a free flow state, especially at a low CAV penetration rate, the relationship between the observed CAV data and flow rate cannot be easily described by a physical model due to the randomness in vehicle arrivals. Thus, a data-driven method, such as the one proposed in this paper, may be more effective in capturing the complex relationship.</p>
<p>
<xref ref-type="fig" rid="F2">Figure&#x20;2</xref> presents the architecture of the proposed deep learning based method with two hidden layers (with ten nodes) and one output layer (with one node). Note that we use two hidden layers as we found during a numerical experiment that the model performance does not improve significantly with more hidden layers. Nonetheless, the architecture can be modified based on the data properties without changing the proposed framework. To train the model, initially, the input data of CAV headways, <inline-formula id="inf26">
<mml:math id="m30">
<mml:mrow>
<mml:mi mathvariant="bold-italic">h</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>h</mml:mi>
<mml:mi>N</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, are connected to each node in the first hidden layer through the weight matrix of <inline-formula id="inf27">
<mml:math id="m31">
<mml:mrow>
<mml:msup>
<mml:mi mathvariant="bold-italic">W</mml:mi>
<mml:mn>1</mml:mn>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>w</mml:mi>
<mml:mrow>
<mml:mn>1,1</mml:mn>
</mml:mrow>
<mml:mn>1</mml:mn>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mi>w</mml:mi>
<mml:mrow>
<mml:mn>10</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mn>1</mml:mn>
</mml:msubsup>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
<mml:mo>&#xa0;</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>. Each node generates net input <inline-formula id="inf28">
<mml:math id="m32">
<mml:mrow>
<mml:msup>
<mml:mi mathvariant="bold-italic">n</mml:mi>
<mml:mn>1</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula>, with bias <inline-formula id="inf29">
<mml:math id="m33">
<mml:mrow>
<mml:msup>
<mml:mi mathvariant="bold-italic">b</mml:mi>
<mml:mn>1</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula>, and <inline-formula id="inf30">
<mml:math id="m34">
<mml:mrow>
<mml:msup>
<mml:mi mathvariant="bold-italic">n</mml:mi>
<mml:mn>1</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> will be transformed to output vector <inline-formula id="inf31">
<mml:math id="m35">
<mml:mrow>
<mml:msup>
<mml:mi mathvariant="bold-italic">a</mml:mi>
<mml:mn>1</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> through activation function <inline-formula id="inf32">
<mml:math id="m36">
<mml:mrow>
<mml:msup>
<mml:mi>f</mml:mi>
<mml:mn>1</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula>, as presented in the figure. Then <inline-formula id="inf33">
<mml:math id="m37">
<mml:mrow>
<mml:msup>
<mml:mi mathvariant="bold-italic">a</mml:mi>
<mml:mn>1</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> becomes the input vector to the second hidden layer, and the same process is repeated to generate <inline-formula id="inf34">
<mml:math id="m38">
<mml:mrow>
<mml:msup>
<mml:mi mathvariant="bold-italic">a</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula>. The output layer has only one node that generates <inline-formula id="inf35">
<mml:math id="m39">
<mml:mrow>
<mml:msup>
<mml:mi mathvariant="bold-italic">n</mml:mi>
<mml:mn>3</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula>, which will be transformed to a final output vector for estimated flow rates, <inline-formula id="inf36">
<mml:math id="m40">
<mml:mrow>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">q</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula>, via activation function <inline-formula id="inf37">
<mml:math id="m41">
<mml:mrow>
<mml:msup>
<mml:mi>f</mml:mi>
<mml:mn>3</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula>. The activation functions in the two hidden layers, <inline-formula id="inf38">
<mml:math id="m42">
<mml:mrow>
<mml:msup>
<mml:mi>f</mml:mi>
<mml:mn>1</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf39">
<mml:math id="m43">
<mml:mrow>
<mml:msup>
<mml:mi>f</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula>, are rectified linear unit (ReLU) functions and the activation function in the output layer, <inline-formula id="inf40">
<mml:math id="m44">
<mml:mrow>
<mml:msup>
<mml:mi>f</mml:mi>
<mml:mn>3</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</inline-formula>, is a linear function for scaling. The output vector of <inline-formula id="inf41">
<mml:math id="m45">
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi mathvariant="bold-italic">q</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>q</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>q</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mi>M</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, where <inline-formula id="inf42">
<mml:math id="m46">
<mml:mi>M</mml:mi>
</mml:math>
</inline-formula> represents the number of datasets for training, will be compared with the target vector of <inline-formula id="inf43">
<mml:math id="m47">
<mml:mi mathvariant="bold-italic">q</mml:mi>
</mml:math>
</inline-formula>, the ground-truth, to tune the weights and biases through the backpropagation algorithm (<xref ref-type="bibr" rid="B37">Rumelhart et&#x20;al., 1986</xref>), which aims to minimize the objective function of mean square error (MSE), expressed as:<disp-formula id="e5">
<mml:math id="m48">
<mml:mrow>
<mml:mtext>MSE</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>M</mml:mi>
</mml:msubsup>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>q</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mstyle>
</mml:mrow>
<mml:mi>M</mml:mi>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(5)</label>
</disp-formula>
</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>Architecture of the proposed Deep-learning method [Reconstructed from (<xref ref-type="bibr" rid="B1">Beale et&#x20;al., 2015</xref>) and (<xref ref-type="bibr" rid="B19">Jun et&#x20;al., 2017</xref>)].</p>
</caption>
<graphic xlink:href="ffutr-02-644988-g002.tif"/>
</fig>
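<p>The forward pass and objective function described above can be sketched as follows. This is a minimal NumPy illustration rather than the implementation used in this study: the helper names (<italic>init_params</italic>, <italic>forward</italic>, <italic>mse</italic>), the random initialization, and the choice of ten nodes in both hidden layers are assumptions made for the example, and the backpropagation step that tunes the weights and biases is omitted.</p>
<preformat>
```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def init_params(n_inputs, n_hidden=10, seed=0):
    """Create (W, b) pairs for two hidden layers and one single-node output layer."""
    rng = np.random.default_rng(seed)
    sizes = [n_inputs, n_hidden, n_hidden, 1]
    # W has shape (nodes in this layer, nodes in the previous layer).
    return [
        (rng.normal(0.0, 0.1, size=(n_out, n_in)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(params, headways):
    """Map an (M, N) array of headways to an (M,) array of flow rate estimates."""
    a = np.asarray(headways, dtype=float)
    for layer, (w, b) in enumerate(params):
        n = a @ w.T + b                      # net input n^k = W^k a^(k-1) + b^k
        is_output = layer == len(params) - 1
        a = n if is_output else relu(n)      # f^3 is linear; f^1 and f^2 are ReLU
    return a[:, 0]


def mse(q_true, q_hat):
    """Mean square error of Eq. 5."""
    q_true = np.asarray(q_true, dtype=float)
    q_hat = np.asarray(q_hat, dtype=float)
    return float(np.mean((q_true - q_hat) ** 2))
```
</preformat>
<p>With untrained parameters the outputs are of course meaningless; the sketch only shows how the layers, activation functions, and objective function fit together.</p>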
<p>After training, this model can estimate flow rates from a new set of headway data. Notably, the deep learning based model does not require any assumptions about traffic flow properties, such as the likelihood function in the Bayesian approach. However, as we will show later, its accuracy is close to, and sometimes better than, that of the Bayesian approach. Note that for the proposed deep learning based method, we used a simple &#x201c;vanilla&#x201d; neural network under the assumption that there is no specific relationship between the order of headways and the flow rate, since CAVs are randomly distributed in the traffic flow. If the headway sequence is deemed significant, though this is unlikely in most foreseeable conditions, a Recurrent Neural Network (RNN) or Long Short-Term Memory (LSTM) network would be more suitable for estimating the flow rate. More discussion on deep learning applications will be provided in the conclusion.</p>
<p>In the following sections, we will investigate the features of the deep learning based method in detail and verify that this method can be effective for estimating the flow rate using only observed CAV data. However, when the relationship between the flow rate and CAV data is too weak (e.g., light traffic or a low CAV penetration rate), this method fails to provide meaningful results as it only aims to minimize the objective function (<xref ref-type="disp-formula" rid="e5">Eq. 5</xref>). The detailed results and insights will be presented&#x20;later.</p>
</sec>
</sec>
<sec id="s3">
<title>Numerical Experiment</title>
<sec id="s3-1">
<title>Numerical Experiment Set-Up</title>
<p>To investigate the features of the proposed methods, we conduct a numerical experiment in this section. For the headway data, we generate 1,000 data sets of 100 headways each, where each headway is randomly drawn from an exponential distribution with a mean of 1.8&#xa0;s (equivalent to a flow rate of 2,000&#xa0;veh/hr). The cases of light and heavy traffic demand are also investigated in Section <italic>Effects of Traffic Demand on Flow Rate Estimation</italic>. Note that we use an exponential distribution to generate random vehicle arrivals in a free flow state, but it can be replaced with any other distribution. The actual flow rate for each data set is derived as the reciprocal of the mean of the 100 headways, and the 1,000 data sets represent a wide range of flow rates, as illustrated in <xref ref-type="fig" rid="F3">Figure&#x20;3A</xref>. Note that by the central limit theorem, the mean of the 100 headways will be approximately normally distributed with a mean of 1.8&#xa0;s (the population mean) and a standard error of <inline-formula id="inf44">
<mml:math id="m49">
<mml:mrow>
<mml:mn>0.18</mml:mn>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mn>1.8</mml:mn>
</mml:mrow>
<mml:mo>/</mml:mo>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:mn>100</mml:mn>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mtext>&#x2009;</mml:mtext>
<mml:mtext>s</mml:mtext>
</mml:mrow>
</mml:math>
</inline-formula>. From the 100 headways in each data set, we randomly select a subset according to the assumed penetration rate of CAVs. For example, if the penetration rate is 30%, 30 headways from each data set are used to estimate a flow&#x20;rate.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>Examples of flow rate histograms: <bold>(A)</bold> actual flow rate; <bold>(B)</bold> flow rate from prior distribution.</p>
</caption>
<graphic xlink:href="ffutr-02-644988-g003.tif"/>
</fig>
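<p>The data generation described above can be sketched as follows. The seed, the variable names, and the helper <italic>sample_cav_headways</italic> are illustrative assumptions rather than part of the original experiment.</p>
<preformat>
```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, chosen only for repeatability

N_SETS, N_HEADWAYS, MEAN_HEADWAY_S = 1000, 100, 1.8

# Each row is one data set of 100 exponentially distributed headways (seconds).
headways = rng.exponential(MEAN_HEADWAY_S, size=(N_SETS, N_HEADWAYS))

# "Actual" flow rate per data set: reciprocal of the mean headway, in veh/hr.
actual_flow = 3600.0 / headways.mean(axis=1)


def sample_cav_headways(all_headways, penetration, rng):
    """Keep a random subset of headways per data set, e.g. 30 of 100 at 30%."""
    n_keep = int(round(penetration * all_headways.shape[1]))
    # Shuffle each row independently, then keep the first n_keep entries.
    return rng.permuted(all_headways, axis=1)[:, :n_keep]


cav_headways = sample_cav_headways(headways, 0.30, rng)
```
</preformat>
<p>The standard deviation of <italic>headways.mean(axis=1)</italic> should be close to the 0.18&#xa0;s standard error given by the central limit theorem.</p>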
<p>For the Bayesian inference method, additional information on the prior distribution, <inline-formula id="inf45">
<mml:math id="m50">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>q</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, should be defined. We consider that historical flow rates are described by a bell-shaped gamma distribution with a mean of 2,000&#xa0;veh/hr and a standard deviation of 500&#xa0;veh/hr to represent traffic that features recurrent daily patterns. We assume a relatively large deviation for the initial experiment to represent a less optimistic scenario of limited prior information, but a sensitivity analysis for smaller and larger standard deviations is also conducted in Section <italic>Effects of Prior Distribution on Flow Rate Estimation</italic>. An example of 50 flow rates drawn from the assumed prior distribution is illustrated in <xref ref-type="fig" rid="F3">Figure&#x20;3B</xref>. The figure shows that the historical flow rates are concentrated near the true mean of 2,000&#xa0;veh/hr, but the range is quite large (e.g., 1,200&#x2013;3,500&#xa0;veh/hr), which makes the prior alone unsuitable for real-time flow rate estimation. Instead, in the Bayesian inference based method, this prior distribution will be updated with real-time CAV data for more accurate flow rate estimation. The likelihood function of headway for a given flow rate, <inline-formula id="inf46">
<mml:math id="m51">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, is assumed to be an exponential distribution to characterize random vehicle arrivals in free flow traffic.</p>
<p>For the deep learning based method, we divide 1,000 data sets into three groups: 70% for training, 15% for <italic>validation</italic>, and 15% for <italic>test</italic>.<xref ref-type="fn" rid="FN3">
<sup>3</sup>
</xref> The <italic>validation</italic> data set is used as an extension of training to avoid overfitting and improve generalization (<xref ref-type="bibr" rid="B34">Piotrowski and Napiorkowski, 2013</xref>). After training, the <italic>test</italic> data set is used to estimate flow rates. Note that 150 estimated flow rates are compared against the &#x201c;ground truth&#x201d; for the deep learning based method, while 1,000 flow rates are estimated and evaluated for the other methods.</p>
</sec>
<sec id="s3-2">
<title>Overall Results and Findings</title>
<p>
<xref ref-type="fig" rid="F4">Figures 4A&#x2013;C</xref> present scatter plots of ground-truth (<italic>x</italic>-axis) vs. estimated (<italic>y</italic>-axis) flow rates by each method with different CAV penetration rates (10&#x2013;70%), and <xref ref-type="fig" rid="F4">Figure&#x20;4D</xref> shows the root mean square error (RMSE) for each case. Note that we present RMSE instead of MSE because it is expressed in the same unit as the flow rate and thus gives a better sense of the estimation error. When the penetration rate of CAVs is relatively high (&#x3e;70%), all three methods perform well, but at a low penetration rate (10%), each method shows different features.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>Results of numerical experiment: <bold>(A)</bold> Na&#xef;ve method; <bold>(B)</bold> Bayesian inference; <bold>(C)</bold> Deep-learning; <bold>(D)</bold> RMSE vs. CAV penetration rate for each method.</p>
</caption>
<graphic xlink:href="ffutr-02-644988-g004.tif"/>
</fig>
<p>The baseline na&#xef;ve method, as expected, shows widely dispersed results at low CAV penetration, as presented on the left side of <xref ref-type="fig" rid="F4">Figure&#x20;4A</xref>: the estimated flow rate spans a wide range of 1,000&#x2013;4,000&#xa0;veh/hr although the actual flow rate is within 1,500&#x2013;2,500&#xa0;veh/hr. This is because the headways from CAVs at a low penetration rate have a large deviation, leading to estimates with low accuracy and precision, as evidenced by a large RMSE value in <xref ref-type="fig" rid="F4">Figure&#x20;4D</xref>.</p>
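<p>The dependence of the na&#xef;ve estimate on the sample size can be reproduced qualitatively with a few lines of code. The sketch below assumes that the na&#xef;ve estimate is the reciprocal of the mean observed CAV headway; the seed, the names, and the set of penetration rates are illustrative choices, so the resulting values are not those plotted in Figure&#x20;4D.</p>
<preformat>
```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed
headways = rng.exponential(1.8, size=(1000, 100))   # seconds
actual_flow = 3600.0 / headways.mean(axis=1)        # veh/hr


def naive_rmse(penetration):
    """RMSE of the reciprocal-of-mean-headway estimate at one penetration rate."""
    n_keep = int(round(penetration * headways.shape[1]))
    sampled = rng.permuted(headways, axis=1)[:, :n_keep]
    estimate = 3600.0 / sampled.mean(axis=1)
    return float(np.sqrt(np.mean((actual_flow - estimate) ** 2)))


rmse_by_rate = {rate: naive_rmse(rate) for rate in (0.1, 0.3, 0.5, 0.7)}
```
</preformat>
<p>Because the mean of only ten headways varies far more than the mean of seventy, the RMSE in this sketch shrinks steadily as the penetration rate grows.</p>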
<p>The methods based on Bayesian inference (<xref ref-type="fig" rid="F4">Figure&#x20;4B</xref>) and deep learning (<xref ref-type="fig" rid="F4">Figure&#x20;4C</xref>) present different features. Compared to the na&#xef;ve method, the results from the Bayesian inference tend, though with scatter, to follow the reference line even at a low CAV penetration rate. This feature can be explained by the process of Bayesian inference, which reflects the information from both observed data (through the likelihood function) and the distribution of historical ground truth (through the prior distribution): the probability of flow rate is initially determined by the prior distribution but gets updated with observed headways. <xref ref-type="fig" rid="F5">Figure&#x20;5</xref> presents an example to better illustrate the process. In this example, the actual flow rate (from 100 headways) is 2,375&#xa0;veh/hr as marked by the left (red) dashed vertical line, and ten headways are available (10% penetration), with a mean of 1.06&#xa0;s. Before updating with CAV data, we initially have a prior distribution, as represented by the left-most (black) curve. Note that, as assumed above, the prior distribution is a gamma distribution with a mean of 2,000&#xa0;veh/hr and a standard deviation of 500&#xa0;veh/hr. With CAV headways, we can derive a likelihood function as represented by the right-most (blue) curve. Notably, the likelihood function only contains the information from CAV data, and its mode (3,399&#xa0;veh/hr) is the same as the estimate from the na&#xef;ve method. In the Bayesian process, we derive a posterior distribution for flow rate by combining the prior distribution and the likelihood function using <xref ref-type="disp-formula" rid="e4">Eq. 4</xref>: see the middle (orange) curve in <xref ref-type="fig" rid="F5">Figure&#x20;5</xref>. 
In this example, the posterior mean is 2,467&#xa0;veh/hr, and the mode is 2,376&#xa0;veh/hr, both of which are closer to the actual flow rate than the prior information or observed data (na&#xef;ve method).</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>Example of Bayesian inference process.</p>
</caption>
<graphic xlink:href="ffutr-02-644988-g005.tif"/>
</fig>
<p>In contrast, at a low CAV penetration rate (10%), the deep learning based method generates estimated flow rates around 2,000&#xa0;veh/hr (the mean of the ground-truth) regardless of the observed data (see the left-most panel in <xref ref-type="fig" rid="F4">Figure&#x20;4C</xref>). This feature is inherent to the deep learning process as presented in <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>. Deep learning seeks to determine the weights and biases in the hidden layers that minimize the objective function. When the relationship between the input data (observed headways) and the target value (flow rate) is weak due to a large variation in the input data, the learning process drives the weights toward zero and sets the biases so that the output is close to the mean of the target values, since this minimizes the objective function. As a result, the estimates converge to approximately 2,000&#xa0;veh/hr, the mean of the training data, even though such estimates are unrealistic. With increasing penetration rates, however, the learning process finds stronger relations between observed headways and the target flow rates, and thus estimates flow rates accurately and reliably, as presented in <xref ref-type="fig" rid="F4">Figures 4C,D</xref>. The results suggest that the deep learning based method can be effective only when a sufficient amount of CAV data is available (i.e.,&#x20;in moderate to high CAV penetration). For a deeper investigation of the deep learning based method, we also estimate flow rates using the conventional data-driven method of multiple linear regression. As presented in <xref ref-type="fig" rid="F4">Figure&#x20;4D</xref>, the results from deep learning and regression are similar, though the deep learning based method performs slightly better when the penetration rate is less than 50%. 
This is because headways are generated from a single distribution for the experiment, and both approaches find the best parameter values by minimizing error. At least in this experiment, there is no specific advantage to using the deep learning based method to estimate the flow rate from CAV data. However, the superiority of the deep learning based method will become clear in a real-world case, where we expect a more complicated relationship between the CAV headways and flow rate. The detailed results will be presented in Section <italic>Validation Results</italic>.</p>
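<p>The collapse toward the mean of the training targets follows directly from the MSE objective: if the inputs carry no usable information, the network can do no better than output a single constant, and the constant that minimizes the MSE is the sample mean. The short check below illustrates this with hypothetical target values; the normal distribution, the seed, and the candidate grid are assumptions made only for the example.</p>
<preformat>
```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed
targets = rng.normal(2000.0, 200.0, size=700)    # hypothetical training flow rates, veh/hr

# With all weights at zero, the network outputs one constant for every input.
candidates = np.linspace(1500.0, 2500.0, 1001)   # candidate constants, 1 veh/hr apart
mse_per_candidate = ((targets[:, None] - candidates[None, :]) ** 2).mean(axis=0)
best_constant = candidates[np.argmin(mse_per_candidate)]
```
</preformat>
<p>Here <italic>best_constant</italic> is the grid point closest to <italic>targets.mean()</italic>, which mirrors the behavior of the trained model at a 10% penetration rate.</p>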
<p>Lastly, it is notable that the performance of all methods improves in a nearly linear fashion as the CAV penetration rate increases; see <xref ref-type="fig" rid="F4">Figure&#x20;4D</xref>. The na&#xef;ve method improves the most, though its RMSE values are much greater at low penetration. At high CAV penetration, all methods perform well, and they perform about equally at a penetration rate of around 80%. Beyond 80%, however, the na&#xef;ve method and the deep learning based method appear to perform better and improve faster than the Bayesian inference based method. This result underscores a limitation of the Bayesian process, in that prior information continues to influence the estimation even when a sufficient amount of real-time data is available. Obviously, if the prior distribution is significantly different from the actual flow rate, it can actually hinder accurate estimation. We should note, however, that the performance of the Bayesian inference based method could vary depending on the available prior information and model structure. In this research, the prior information is defined as a distribution of historical flow rate, and it is applied in the same way to estimate flow rate regardless of the CAV penetration rate. If the penetration rate is sufficiently high, short-term past CAV data would serve as better prior information, or real-time CAV data could be weighted more than prior information. More studies are needed in the future to explore various cases in detail.</p>
</sec>
<sec id="s3-3">
<title>Effects of Traffic Demand on Flow Rate Estimation</title>
<p>This section investigates the effects of traffic demand on estimating the flow rate. To this end, we consider three demand scenarios and generate headway data sets in the same manner as in Section <italic>Numerical Experiment Set-Up</italic>. Specifically, for each scenario we generate 1,000 data sets (of 100 headways each) from an exponential distribution with a mean of 3&#xa0;s (1,200&#xa0;veh/hr, low demand), 2&#xa0;s (1,800&#xa0;veh/hr, medium demand), or 1.5&#xa0;s (2,400&#xa0;veh/hr, high demand). For each scenario, the flow rates are estimated by the three methods. For comparison, we compute the root mean square percentage error (RMSPE) as a measure of relative error, in addition to the RMSE:<disp-formula id="e6">
<mml:math id="m52">
<mml:mrow>
<mml:mtext>RMSPE</mml:mtext>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mo>%</mml:mo>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>100</mml:mn>
<mml:mo>&#xd7;</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>M</mml:mi>
</mml:msubsup>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>q</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>/</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
<mml:mi>M</mml:mi>
</mml:mfrac>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:math>
<label>(6)</label>
</disp-formula>
</p>
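<p>Both error measures are straightforward to compute. The following sketch implements the RMSE and the RMSPE of Eq. 6; the function names and the two-element example are illustrative assumptions.</p>
<preformat>
```python
import numpy as np


def rmse(q_true, q_hat):
    """Root mean square error, in the unit of the flow rate (veh/hr)."""
    q_true, q_hat = np.asarray(q_true, float), np.asarray(q_hat, float)
    return float(np.sqrt(np.mean((q_true - q_hat) ** 2)))


def rmspe(q_true, q_hat):
    """Root mean square percentage error (Eq. 6), in percent."""
    q_true, q_hat = np.asarray(q_true, float), np.asarray(q_hat, float)
    return float(100.0 * np.sqrt(np.mean(((q_true - q_hat) / q_true) ** 2)))


# Hypothetical example: the same 100 veh/hr error at two different flow rates.
example_rmse = rmse([2000.0, 1000.0], [1900.0, 1100.0])
example_rmspe = rmspe([2000.0, 1000.0], [1900.0, 1100.0])
```
</preformat>
<p>In the example, both errors are 100&#xa0;veh/hr in absolute terms, but the error at 1,000&#xa0;veh/hr contributes 10% while the error at 2,000&#xa0;veh/hr contributes only 5%, which is why the two measures can move in opposite directions as demand changes.</p>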
<p>
<xref ref-type="fig" rid="F6">Figure&#x20;6</xref> presents the RMSEs (left column) and RMSPEs (right column) for each scenario. For the na&#xef;ve method, the RMSEs increase with the demand, but the relative errors (RMSPEs) decrease significantly as the demand increases, especially at a low CAV penetration rate. For example, when the CAV penetration rate is 10%, the RMSPE decreases from 38.4% (low demand) to 22.4% (high demand). This result is expected since headways in higher demand have lower deviations due to less random vehicle arrivals, and thus, a partial headway sample can represent the traffic flow rate better. This trend is also observed in the Bayesian and deep learning based methods. When the demand is high, these two methods have low RMSEs (less than 100&#xa0;veh/hr) and RMSPEs (less than 4.0%). The results clearly indicate that the accuracy of flow estimation is affected significantly by the demand&#x20;level.</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>RMSE and RMSPE for different traffic demand; <bold>(A)</bold> RMSE for low demand; <bold>(B)</bold> RMSPE for low demand; <bold>(C)</bold> RMSE for medium demand; <bold>(D)</bold> RMSPE for medium demand; <bold>(E)</bold> RMSE for high demand; <bold>(F)</bold> RMSPE for high demand.</p>
</caption>
<graphic xlink:href="ffutr-02-644988-g006.tif"/>
</fig>
</sec>
<sec id="s3-4">
<title>Effects of Prior Distribution on Flow Rate Estimation</title>
<p>As presented in Section <italic>Overall Results and Findings</italic> (with <xref ref-type="fig" rid="F5">Figure&#x20;5</xref>), prior information is essential for the Bayesian inference based method. Here, we conduct an additional experiment to examine the effect of the prior distribution on the flow rate estimation. Specifically, we consider three different gamma distributions as prior distributions with the same mean of 2,000&#xa0;veh/hr but different standard deviations of 200, 500, and 800&#xa0;veh/hr (referred to as small, medium, and large deviations hereafter). The corresponding [shape, scale] parameters of the gamma distributions are [100, 20], [16, 125], and [6.25, 320], respectively. Notably, the small deviation represents the case in which historical flow rates are similar, whereas the large deviation represents a wide variation in historical flow rates. <xref ref-type="fig" rid="F7">Figure&#x20;7</xref> presents RMSEs of flow rate estimation with different prior distributions. Note that the (blue) line with triangular markers is the same as the one in <xref ref-type="fig" rid="F4">Figure&#x20;4D</xref> for the Bayesian inference based method. At low penetration (&#x3c;35%), RMSEs are similar for the cases of small deviation and medium deviation. However, as the penetration rate increases, the RMSE improves more slowly for the small deviation case. Evidently, the prior distribution with the small deviation has a greater influence on the flow estimation and actually hinders the estimation when there is sufficient real-time information. One can see in <xref ref-type="fig" rid="F5">Figure&#x20;5</xref> that a narrower prior distribution (with the same mean) would &#x201c;pull&#x201d; the posterior distribution closer to the prior distribution. On the other hand, the prior distribution with the large deviation does not provide much information when it is needed to estimate the flow rate at a low CAV penetration rate, contributing to relatively large RMSE values. 
However, the accuracy of flow estimation improves quickly as more real-time data become available because the prior distribution has a weak influence on the estimation process due to its large deviation. The results suggest that the Bayesian inference based method should be adopted with caution, considering the features of the prior information and the availability of real-time data (traffic demand, CAV penetration rate).</p>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption>
<p>RMSE for different standard deviations of the historical&#x20;data.</p>
</caption>
<graphic xlink:href="ffutr-02-644988-g007.tif"/>
</fig>
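<p>The [shape, scale] pairs quoted above follow from matching the mean and standard deviation of a gamma distribution (mean = shape &#xd7; scale, variance = shape &#xd7; scale squared). A short check, with the hypothetical helper name <italic>gamma_shape_scale</italic>:</p>
<preformat>
```python
def gamma_shape_scale(mean, std):
    """Moment matching: mean = shape * scale, variance = shape * scale**2."""
    shape = (mean / std) ** 2
    scale = std ** 2 / mean
    return shape, scale


# Priors with a mean of 2,000 veh/hr and small, medium, and large deviations.
params = {std: gamma_shape_scale(2000.0, std) for std in (200.0, 500.0, 800.0)}
```
</preformat>
<p>This yields (100, 20), (16, 125), and (6.25, 320) for the small, medium, and large deviations, in agreement with the values given in the text.</p>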
</sec>
<sec id="s3-5">
<title>Probability Distribution of Flow Rate From Bayesian Inference Based Method</title>
<p>One distinguishing feature of Bayesian inference is that, unlike the other methods, it derives a flow rate distribution rather than a single value. This means that we can not only use the mean or mode of the posterior distribution as a point estimate, but also estimate the probability that the flow rate exceeds a certain value. This feature is a notable advantage of the Bayesian inference method, as it can be used to quantify the probability of traffic breakdown (<xref ref-type="bibr" rid="B7">Elefteriadou et&#x20;al., 1995</xref>; <xref ref-type="bibr" rid="B33">Persaud et&#x20;al., 1998</xref>; <xref ref-type="bibr" rid="B9">Evans et&#x20;al., 2001</xref>; <xref ref-type="bibr" rid="B4">Brilon et&#x20;al., 2005</xref>; <xref ref-type="bibr" rid="B41">Shiomi et&#x20;al., 2011</xref>; <xref ref-type="bibr" rid="B5">Chen et&#x20;al., 2014</xref>; <xref ref-type="bibr" rid="B13">Han and Ahn, 2018</xref>), which in turn supports proactive control to prevent traffic breakdown. For example, we consider a critical flow rate, <inline-formula id="inf47">
<mml:math id="m53">
<mml:mrow>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, at 2,200&#xa0;veh/hr and estimate the probability that the flow rate exceeds <inline-formula id="inf48">
<mml:math id="m54">
<mml:mrow>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> at different penetration rates, as presented in <xref ref-type="fig" rid="F8">Figures 8A&#x2013;J</xref>. The <italic>x</italic>-axis is the actual flow rate, and the <italic>y</italic>-axis shows the estimated probability that <inline-formula id="inf49">
<mml:math id="m55">
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>q</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x3e;</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>. Taking 0.5 as the critical probability for judging the accuracy of the estimation, the four quadrants (see <xref ref-type="fig" rid="F8">Figure&#x20;8A</xref>) represent the following categories: 1Q is the &#x201c;Hit&#x201d; that <inline-formula id="inf50">
<mml:math id="m56">
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>q</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x3e;</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> when <inline-formula id="inf51">
<mml:math id="m57">
<mml:mrow>
<mml:mi>q</mml:mi>
<mml:mo>&#x3e;</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, 2Q is the &#x201c;False Alarm&#x201d; that <inline-formula id="inf52">
<mml:math id="m58">
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>q</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x3e;</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> when <inline-formula id="inf53">
<mml:math id="m59">
<mml:mrow>
<mml:mi>q</mml:mi>
<mml:mo>&#x3c;</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, 3Q is the &#x201c;Correct Rejection&#x201d; that <inline-formula id="inf54">
<mml:math id="m60">
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>q</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> when <inline-formula id="inf55">
<mml:math id="m61">
<mml:mrow>
<mml:mi>q</mml:mi>
<mml:mo>&#x3c;</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, and 4Q is the &#x201c;Miss&#x201d; that <inline-formula id="inf56">
<mml:math id="m62">
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>q</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x3c;</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> when <inline-formula id="inf57">
<mml:math id="m63">
<mml:mrow>
<mml:mi>q</mml:mi>
<mml:mo>&#x3e;</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>. The rates of Hit and False Alarm are shown in <xref ref-type="fig" rid="F8">Figure&#x20;8K</xref>. The Hit rate increases with the CAV penetration rate while the False Alarm rate decreases.</p>
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption>
<p>
<bold>(A)&#x2013;(J)</bold> Probability over critical flow rate through Bayesian inference; <bold>(K)</bold> Hit and False alarm&#x20;rates.</p>
</caption>
<graphic xlink:href="ffutr-02-644988-g008.tif"/>
</fig>
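<p>The exceedance probability and the four categories described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used to produce Figure 8: it assumes the posterior is a gamma distribution with known shape and scale, evaluates its cumulative distribution with a power series, and defines the Hit rate as hits / (hits + misses) and the False Alarm rate as false alarms / (false alarms + correct rejections), a convention the text does not state explicitly. All function names and the numerical inputs are hypothetical.</p>
<preformat><![CDATA[
```python
import math


def gamma_cdf(x, shape, scale):
    """Gamma CDF via the power series of the regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape
    total = term
    k = 0
    while term > 1e-16 * total and k < 100000:
        k += 1
        term *= z / (shape + k)
        total += term
    return total * math.exp(shape * math.log(z) - z - math.lgamma(shape))


def exceedance_probability(q_c, shape, scale):
    """P(flow > q_c) under a gamma posterior with the given shape and scale."""
    return 1.0 - gamma_cdf(q_c, shape, scale)


def classify(prob, actual_flow, q_c, p_crit=0.5):
    """Quadrant label for one (actual flow, estimated probability) point."""
    predicted_over = prob > p_crit
    actually_over = actual_flow > q_c
    if predicted_over and actually_over:
        return "Hit"
    if predicted_over:
        return "False Alarm"
    if actually_over:
        return "Miss"
    return "Correct Rejection"


def hit_and_false_alarm_rates(labels):
    """Assumed convention: hits/(hits+misses) and false alarms/(false alarms+rejections)."""
    hits = labels.count("Hit")
    misses = labels.count("Miss")
    false_alarms = labels.count("False Alarm")
    rejections = labels.count("Correct Rejection")
    hit_rate = hits / (hits + misses) if hits + misses else float("nan")
    fa_rate = (
        false_alarms / (false_alarms + rejections)
        if false_alarms + rejections
        else float("nan")
    )
    return hit_rate, fa_rate


# Hypothetical posterior (shape 110, scale 21 veh/hr, mean 2,310 veh/hr) and actual flow.
p = exceedance_probability(2200.0, 110.0, 21.0)
label = classify(p, actual_flow=2350.0, q_c=2200.0)
print(f"P(q > q_c) = {p:.2f}, category = {label}")
```
]]></preformat>
<p>Where SciPy is available, the hand-written series could be replaced by the survival function <monospace>scipy.stats.gamma.sf</monospace>; the series is used here only to keep the sketch free of external dependencies.</p>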
</sec>
</sec>
<sec id="s4">
<title>Validation With Real Data</title>
<sec id="s4-1">
<title>Data and Assumptions</title>
<p>The proposed methods are validated with real data. We use the NGSIM prototype data (<xref ref-type="bibr" rid="B28">NGSIM, 2006</xref>) for a section of I-80 near the San Francisco Bay Area, CA. This freeway section is 3,000&#xa0;ft long and has six lanes, including a high-occupancy vehicle lane, and the data was collected for a 30&#xa0;min period in December 2003 at the resolution of 1/15 of a second. Note that the prototype NGSIM data includes both free flow and congested traffic states.</p>
<p>We divide the time-space domain into 450 subsections that are 100 feet by 2&#xa0;min. From the vehicle trajectories, we derive headway data at the midpoint of each subsection as shown earlier in <xref ref-type="fig" rid="F1">Figure&#x20;1</xref>, and calculate the actual flow rate for each subsection using all the headways. Then, we randomly designate &#x201c;CAVs&#x201d; considering the penetration rate and estimate a flow rate by each method using the CAV headway data. For the Deep-learning method, 315 subsections (70%) are used for model training, and 67 and 68 subsections (15% each) are used for validation and test, respectively.</p>
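<p>The partitioning and sampling steps described above can be sketched in Python as follows. The sketch is a hedged illustration rather than the procedure used in this study: it assumes that each vehicle is designated as a CAV independently with probability equal to the penetration rate and that the subsections are shuffled before the 315/67/68 split, neither of which is specified in the text. The function names, the seed, and the vehicle identifiers are hypothetical.</p>
<preformat><![CDATA[
```python
import random

SECTION_LENGTH_FT = 3000
CELL_LENGTH_FT = 100
DURATION_MIN = 30
CELL_DURATION_MIN = 2

# 30 spatial cells x 15 time steps = 450 subsections.
cells = [
    (space_idx, time_idx)
    for space_idx in range(SECTION_LENGTH_FT // CELL_LENGTH_FT)
    for time_idx in range(DURATION_MIN // CELL_DURATION_MIN)
]


def designate_cavs(vehicle_ids, penetration_rate, rng):
    """Flag each vehicle as a CAV independently with the given probability (assumed scheme)."""
    return [vid for vid in vehicle_ids if rng.random() < penetration_rate]


def split_cells(all_cells, rng, n_train=315, n_val=67):
    """Shuffle the subsections and split them into training, validation, and test sets."""
    shuffled = list(all_cells)
    rng.shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train, val, test = split_cells(cells, rng)
cavs = designate_cavs(range(1000), penetration_rate=0.2, rng=rng)  # hypothetical IDs
print(len(cells), len(train), len(val), len(test), len(cavs))
```
]]></preformat>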
<p>For the Bayesian inference method, prior information is required; however, historical data at the NGSIM site is not available. Instead, we investigate the flow rate near the NGSIM site to observe its general characteristics over time. Specifically, we analyzed the data in 2004 through the Performance Measurement System (<xref ref-type="bibr" rid="B32">PeMS, 2018</xref>) at a detector location downstream of the NGSIM site<xref ref-type="fn" rid="FN4">
<sup>4</sup>
</xref>. We found that historical flow rates in that area follow a typical bell-shaped distribution, but the distribution varies by time of day, as illustrated in <xref ref-type="fig" rid="F9">Figure&#x20;9</xref>. This feature was also observed in the NGSIM data: the flow rate was similar throughout the site around the same time, but it changed over time as expected. Based on this observation, we assume that each time step (2&#xa0;min in this evaluation) has a gamma prior distribution whose mean is the average flow rate (over all locations) at that time step in the NGSIM data. The deviation of the prior distribution is assumed to be relatively large, at 500&#xa0;veh/hr, to avoid the correlation between the data and the estimated prior distribution. Note that we obtained 15 prior distributions for the study duration, and each prior distribution applies to all locations. The likelihood function is assumed to be an exponential distribution because most of the observed traffic is in the free-flow state with random vehicle arrivals.</p>
<fig id="F9" position="float">
<label>FIGURE 9</label>
<caption>
<p>Example of distribution of historical flow rates (Detector &#x23; &#x3d; 400679 on I-80, CA, July-Dec, 2004).</p>
</caption>
<graphic xlink:href="ffutr-02-644988-g009.tif"/>
</fig>
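<p>The construction of one prior per time step can be sketched in Python as follows. This is an illustration under stated assumptions and not the code used in this study: the deviation of 500&#xa0;veh/hr is treated as a standard deviation, the gamma parameters are obtained by moment matching, and the function names and the small table of flow rates are hypothetical.</p>
<preformat><![CDATA[
```python
def gamma_prior(mean_flow, sd=500.0):
    """Gamma (shape, scale) with the given mean and standard deviation in veh/hr."""
    shape = (mean_flow / sd) ** 2
    scale = sd ** 2 / mean_flow
    return shape, scale


def priors_by_time_step(flow_by_cell, sd=500.0):
    """Build one prior per time step from flows keyed by (location index, time step)."""
    grouped = {}
    for (_, time_idx), flow in flow_by_cell.items():
        grouped.setdefault(time_idx, []).append(flow)
    return {
        time_idx: gamma_prior(sum(flows) / len(flows), sd)
        for time_idx, flows in sorted(grouped.items())
    }


# Hypothetical actual flow rates (veh/hr) for two locations and two time steps.
flow_by_cell = {(0, 0): 1900.0, (1, 0): 2100.0, (0, 1): 1500.0, (1, 1): 1500.0}
priors = priors_by_time_step(flow_by_cell)
for time_idx, (shape, scale) in priors.items():
    print(f"time step {time_idx}: shape {shape:.2f}, scale {scale:.1f}")
```
]]></preformat>
<p>Each prior returned by the sketch applies to all locations within its time step, mirroring the assumption stated above.</p>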
</sec>
<sec id="s4-2">
<title>Validation Results</title>
<p>
<xref ref-type="fig" rid="F10">Figure&#x20;10</xref> presents an example of the flow rate estimation results by each method with different CAV penetration rates. Similar to the numerical experiment, the na&#xef;ve method shows scattered results at a low penetration rate and a large value of RMSE, but the points gradually move to the reference line with smaller RMSE as the penetration rate increases. On the other hand, the Bayesian inference method estimates well even at low penetration rates, and the RMSE steadily decreases with increasing penetration rates. This could be due to the potentially close relationship between the actual flow rates and the assumed prior distributions. Thus, to apply the Bayesian inference, the prior information should represent a general traffic state of the target site. When the traffic condition changes significantly (e.g., a sudden demand increase), the prior distribution should be redefined. Lastly, the deep learning method shows better performance particularly at a low penetration rate. Notably, compared to the multiple linear regression, the deep learning based method clearly performs better with real data, demonstrating that the deep learning based method can better describe the relationship between the CAV headway and the flow&#x20;rate.</p>
<fig id="F10" position="float">
<label>FIGURE 10</label>
<caption>
<p>Example of validation with NGSIM data (Lane 2, I-80) with different CAV penetration rates: <bold>(A)</bold> Na&#xef;ve method; <bold>(B)</bold> Bayesian inference; <bold>(C)</bold> Deep learning; <bold>(D)</bold> RMSE by penetration rate of CAV for each method.</p>
</caption>
<graphic xlink:href="ffutr-02-644988-g010.tif"/>
</fig>
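<p>The RMSE comparison can be illustrated with a short Python sketch of the na&#xef;ve estimator and the error metric. It assumes that the na&#xef;ve estimate is the reciprocal of the arithmetic mean of the sampled headways, converted to veh/hr; the function names, headway samples, and actual flow rates are hypothetical and are not taken from the NGSIM data.</p>
<preformat><![CDATA[
```python
import math


def naive_flow(headways_s):
    """Naive estimate: flow (veh/hr) as the reciprocal of the mean headway (s)."""
    return 3600.0 * len(headways_s) / sum(headways_s)


def rmse(estimates, actuals):
    """Root mean square error between estimated and actual flow rates."""
    squared = [(e - a) ** 2 for e, a in zip(estimates, actuals)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical CAV headways (s) for three subsections and their actual flows (veh/hr).
cav_headways = [[2.0, 2.0], [1.5, 2.5, 2.0], [3.0]]
actual = [1700.0, 1900.0, 1300.0]

estimated = [naive_flow(h) for h in cav_headways]
print(estimated, rmse(estimated, actual))
```
]]></preformat>
<p>The same <monospace>rmse</monospace> function could be applied to the estimates of any of the methods, which is how the curves of RMSE against penetration rate are compared in principle.</p>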
</sec>
</sec>
<sec id="s5">
<title>Conclusion and Discussion</title>
<p>This paper presented flow rate estimation methods using headway data that can presumably be collected from CAVs. Specifically, we developed Bayesian inference and deep learning based methods and evaluated their performance against a baseline na&#xef;ve method based on the simple arithmetic mean of headways. The proposed methods were investigated by numerical experiments and validated with real data. The results show that the Bayesian inference based method can be an effective algorithm to estimate the flow rate distribution by integrating current (real-time) data and previous knowledge, such as historical data. It shows good performance (in terms of accuracy and precision) with a proper prior distribution and likelihood function even at low penetration rates (&#x3c;20%). Thus, this method can be used when historical traffic information, consistent with the current traffic condition, is readily available. However, as the CAV penetration or demand increases, its relative advantage over the other methods (the deep learning based method and even the simple average) wanes because the prior information always influences the flow rate estimation. In particular, at high CAV penetration, where real-time CAV information alone suffices for accurate flow estimation, inclusion of prior information can actually reduce the accuracy. The deep learning based method is found to perform reasonably well using only CAV data when the CAV penetration rate is moderate to high (&#x3e;20%). In particular, it characterizes the complicated relationship in the real world better than the other methods considered in this study. However, when the data are sparse (in light traffic, at low CAV penetration, or with a small amount of data), the method produces an estimate close to the mean of the training data regardless of real-time observations. 
Finally, at a relatively high CAV penetration rate (&#x3e;70%), the relative advantage of the advanced methods is negligible and in fact, the na&#xef;ve method is preferred in terms of accuracy as well as efficiency.</p>
<p>To improve the proposed methods, we suggest several future research directions. For the Bayesian inference based method, we mainly used the exponential-gamma conjugate system for the prior distribution and likelihood function for analytical tractability. Though these assumptions are reasonable for addressing the general characteristics of free-flow traffic, more site-specific functions with calibration would be necessary to apply the method in practice. Furthermore, probabilistic distributions of CAVs should be considered to facilitate theoretical analysis.</p>
<p>We adopted the deep learning based method to better capture the complicated relationship between sampled headways and flow rate in free-flow traffic, which arises from randomness in vehicle arrivals. Though the deep learning based method shows better performance than the other methods considered, particularly in real world estimation, it still has significant error at low CAV penetration. Its performance may improve if other factors, such as time of day, weather, and historical traffic information, are considered as input features. In addition, due to the limitation of the NGSIM data, the proposed deep learning based method is validated with a small dataset, which limits the applicability of this method. An improvement of this method may be possible with a larger dataset and a deeper architecture. Notably, the proposed deep learning approach shows better performance than the na&#xef;ve method even though both methods use the same input data. However, when other data are available, advanced algorithms such as LSTM or convolutional neural networks should be considered to reveal hidden features in a larger dataset. In addition, this paper assumed that CAVs&#x2019; behavior is similar to the behavior of human-driven vehicles in a free flow state; however, CAVs&#x2019; behavior may be altered significantly in some situations due to advanced CAV operations (e.g., platooning, exclusive lane policy). Alternative methods should be developed in such cases. Finally, for the validation with real data, we used all observed data from the NGSIM vehicle trajectory data, some of which may be influenced by merging or lane-changing. Systematic data filtering is desirable in the future to further improve the model performance. Nonetheless, this study presents some insight into how advanced methods can be adopted to address challenges such as the one explored in this study and provides a building block for future studies.</p>
</sec>
</body>
<back>
<sec id="s6">
<title>Data Availability Statement</title>
<p>Publicly available datasets were analyzed in this study. This data can be found here: ITS DataHub [<ext-link ext-link-type="uri" xlink:href="https://its.dot.gov/data/">https://its.dot.gov/data/</ext-link>] and Caltrans Performance Measurement System (PeMS) [<ext-link ext-link-type="uri" xlink:href="http://pems.dot.ca.gov">pems.dot.ca.gov</ext-link>].</p>
</sec>
<sec id="s7">
<title>Author Contributions</title>
<p>YH: conceptualization, literature review, methodology, numerical experiment, validation, results analysis, results visualization, and manuscript-draft. SA: results analysis, manuscript-revision, and supervision.</p>
</sec>
<sec id="s8">
<title>Funding</title>
<p>The authors gratefully acknowledge the National Science Foundation for sponsoring this research through Award CMMI 1536599.</p>
</sec>
<sec sec-type="COI-statement" id="s9">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<fn-group>
<fn id="FN1">
<label>1</label>
<p>A CAV might measure the spacing to the following vehicle as well. However, in this paper, we only consider data related to the leading vehicle since the rear detection range is typically shorter than the front range. If data on the following vehicle are available, the proposed methods can operate with more data, and their framework and features remain the&#x20;same.</p>
</fn>
<fn id="FN2">
<label>2</label>
<p>Density can be derived from the spacing data of CAVs through the same framework described in the following sections. However, for the Bayesian inference (in <italic>Bayesian Inference</italic>), sufficient prior knowledge and a likelihood function of spacing for a given density would be required.</p>
</fn>
<fn id="FN3">
<label>3</label>
<p>We also conducted the same experiment with 3,000 data sets split into three groups for training, validation, and testing, with 1,000 data sets in each group. The results are similar in terms of accuracy (in RMSE) and trends in the scatter&#x20;plot.</p>
</fn>
<fn id="FN4">
<label>4</label>
<p>The NGSIM prototype data was collected in December 2003. However, the quality of the PeMS data near the NGSIM site in 2003 was not desirable according to their data quality assessment. Therefore, we used the data in 2004 instead.</p>
</fn>
</fn-group>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Beale</surname>
<given-names>M. H.</given-names>
</name>
<name>
<surname>Hagan</surname>
<given-names>M. T.</given-names>
</name>
<name>
<surname>Demuth</surname>
<given-names>H. B.</given-names>
</name>
</person-group> (<year>2015</year>). <source>Neural network toolbox user&#x2019;s guide</source>. <publisher-loc>Boston, MA</publisher-loc>: <publisher-name>PWS</publisher-name>.</citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bekiaris-Liberis</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Roncoli</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Papageorgiou</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Highway traffic state estimation per lane in the presence of connected vehicles</article-title>. <source>Transportation Res. B: Methodological</source> <volume>106</volume>, <fpage>1</fpage>&#x2013;<lpage>28</lpage>. <pub-id pub-id-type="doi">10.1016/j.trb.2017.11.001</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bekiaris-Liberis</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Roncoli</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Papageorgiou</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Highway traffic state estimation with mixed connected and conventional vehicles</article-title>. <source>IEEE Trans. Intell. Transport. Syst.</source> <volume>17</volume>, <fpage>3484</fpage>&#x2013;<lpage>3497</lpage>. <pub-id pub-id-type="doi">10.1109/TITS.2016.2552639</pub-id> </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Brilon</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Geistefeldt</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Regler</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2005</year>). &#x201c;<article-title>Reliability of freeway traffic flow</article-title>,&#x201d; in <conf-name>Proceedings 16th international symposium transportation and traffic theory</conf-name>, <conf-loc>College Park, MD</conf-loc>, <conf-date>July 19&#x2013;21, 2005</conf-date>, <fpage>125</fpage>&#x2013;<lpage>144</lpage>. <pub-id pub-id-type="doi">10.1016/b978-008044680-6/50009-x</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>Q.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>A traffic breakdown model based on queueing theory</article-title>. <source>Netw. Spat. Econ.</source> <volume>14</volume>, <fpage>485</fpage>&#x2013;<lpage>504</lpage>. <pub-id pub-id-type="doi">10.1007/s11067-014-9246-6</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Edie</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>1963</year>). &#x201c;<article-title>Discussion of traffic stream measurements and definitions</article-title>,&#x201d; in <conf-name>Proceedings of the second international symposium on the theory of traffic flow</conf-name>, <conf-loc>London</conf-loc>, <conf-date>June 1963</conf-date>, <fpage>139</fpage>&#x2013;<lpage>154</lpage>. </citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Elefteriadou</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Roess</surname>
<given-names>R. P.</given-names>
</name>
<name>
<surname>McShane</surname>
<given-names>W. R.</given-names>
</name>
</person-group> (<year>1995</year>). <article-title>Probabilistic nature of breakdown at freeway merge junctions</article-title>. <source>Transp. Res. Rec.</source> <volume>1484</volume>, <fpage>80</fpage>&#x2013;<lpage>89</lpage>. </citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Elfar</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Xavier</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Talebpour</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Mahmassani</surname>
<given-names>H. S.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Traffic shockwave detection in a connected environment using the speed distribution of individual vehicles</article-title>. <source>Transportation Res. Rec.</source> <volume>2672</volume>, <fpage>203</fpage>&#x2013;<lpage>214</lpage>. <pub-id pub-id-type="doi">10.1177/0361198118794717</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Evans</surname>
<given-names>J.&#x20;L.</given-names>
</name>
<name>
<surname>Elefteriadou</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Gautam</surname>
<given-names>N.</given-names>
</name>
</person-group> (<year>2001</year>). <article-title>Probability of breakdown at freeway merges using Markov chains</article-title>. <source>Transportation Res. Part B: Methodological</source> <volume>35</volume>, <fpage>237</fpage>&#x2013;<lpage>254</lpage>. <pub-id pub-id-type="doi">10.1016/S0191-2615(99)00049-1</pub-id> </citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fei</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>C.-C.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>A bayesian dynamic linear model approach for real-time short-term freeway travel time prediction</article-title>. <source>Transportation Res. C: Emerging Tech.</source> <volume>19</volume>, <fpage>1306</fpage>&#x2013;<lpage>1318</lpage>. <pub-id pub-id-type="doi">10.1016/j.trc.2010.10.005</pub-id> </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fountoulakis</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Bekiaris-Liberis</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Roncoli</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Papamichail</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Papageorgiou</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Highway traffic state estimation with mixed connected and conventional vehicles: microscopic simulation-based testing</article-title>. <source>Transportation Res. Part C: Emerging Tech.</source> <volume>78</volume>, <fpage>13</fpage>&#x2013;<lpage>33</lpage>. <pub-id pub-id-type="doi">10.1016/j.trc.2017.02.015</pub-id> </citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fusco</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Colombaroni</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Isaenko</surname>
<given-names>N.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Short-term speed predictions exploiting big data on large urban road networks</article-title>. <source>Transportation Res. Part C: Emerging Tech.</source> <volume>73</volume>, <fpage>183</fpage>&#x2013;<lpage>201</lpage>. <pub-id pub-id-type="doi">10.1016/j.trc.2016.10.019</pub-id> </citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Han</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Ahn</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Stochastic modeling of breakdown at freeway merge bottleneck and traffic control method using connected automated vehicle</article-title>. <source>Transportation Res. Part B: Methodological</source> <volume>107</volume>, <fpage>146</fpage>&#x2013;<lpage>166</lpage>. <pub-id pub-id-type="doi">10.1016/j.trb.2017.11.007</pub-id> </citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Han</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Ahn</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Variable speed limit control at fixed freeway bottlenecks using connected vehicles</article-title>. <source>Transportation Res. Part B: Methodological</source> <volume>98</volume>, <fpage>113</fpage>&#x2013;<lpage>134</lpage>. <pub-id pub-id-type="doi">10.1016/j.trb.2016.12.013</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hegyi</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Netten</surname>
<given-names>B. D.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Schakel</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Schreiter</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Yuan</surname>
<given-names>Y.</given-names>
</name>
<etal/>
</person-group> (<year>2013</year>). &#x201c;<article-title>A cooperative system based variable speed limit control algorithm against jam waves&#x2014;an extension of the SPECIALIST algorithm</article-title>,&#x201d; in <conf-name>16th International IEEE Conference on Intelligent Transportation Systems</conf-name>, <conf-loc>Hague, Netherlands</conf-loc>, <conf-date>October 6&#x2013;9, 2013</conf-date>, <volume>2</volume>, <fpage>973</fpage>&#x2013;<lpage>978</lpage>. <pub-id pub-id-type="doi">10.1109/ITSC.2013.6728358</pub-id> </citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hofleitner</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Herring</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Abbeel</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Bayen</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Learning the dynamics of arterial traffic from probe data using a dynamic bayesian network</article-title>. <source>IEEE Trans. Intell. Transport. Syst.</source> <volume>13</volume>, <fpage>1679</fpage>&#x2013;<lpage>1693</lpage>. <pub-id pub-id-type="doi">10.1109/TITS.2012.2200474</pub-id> </citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jintanakul</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Chu</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Jayakrishnan</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Bayesian mixture model for estimating freeway travel time distributions from small probe samples from multiple days</article-title>. <source>Transportation Res. Rec.</source> <volume>2136</volume>, <fpage>37</fpage>&#x2013;<lpage>44</lpage>. <pub-id pub-id-type="doi">10.3141/2136-05</pub-id> </citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Julio</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Giesen</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Lizana</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Real-time prediction of bus travel speeds using traffic shockwaves and machine learning algorithms</article-title>. <source>Res. Transportation Econ.</source> <volume>59</volume>, <fpage>250</fpage>&#x2013;<lpage>257</lpage>. <pub-id pub-id-type="doi">10.1016/j.retrec.2016.07.019</pub-id> </citation>
</ref>
<ref id="B19">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Jun</surname>
<given-names>H. J.</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>J.&#x20;K.</given-names>
</name>
<name>
<surname>Bae</surname>
<given-names>C. H.</given-names>
</name>
</person-group> (<year>2017</year>). &#x201c;<article-title>Deep learning neural networks for determining replacement timing of steel water transmission pipes</article-title>,&#x201d; in <conf-name>International conference on control, artificial intelligence, robotics and optimization (ICCAIRO)</conf-name>, <conf-loc>Prague, Czech Republic</conf-loc>, <conf-date>May 20&#x2013;22, 2017</conf-date> (<publisher-name>IEEE</publisher-name>), <fpage>219</fpage>&#x2013;<lpage>225</lpage>. </citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Khodayari</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Ghaffari</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Kazemi</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Braunstingl</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>A modified car-following model based on a neural network model of the human driver effects</article-title>. <source>IEEE Trans. Syst. Man. Cybern. A.</source> <volume>42</volume>, <fpage>1440</fpage>&#x2013;<lpage>1449</lpage>. <pub-id pub-id-type="doi">10.1109/TSMCA.2012.2192262</pub-id> </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kim</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Diagnosis and prediction of traffic congestion on urban road networks using bayesian networks</article-title>. <source>Transportation Res. Rec.</source> <volume>2595</volume>, <fpage>108</fpage>&#x2013;<lpage>118</lpage>. <pub-id pub-id-type="doi">10.3141/2595-12</pub-id> </citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lefevre</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Carvalho</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Borrelli</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>A Learning-based framework for velocity control in autonomous driving</article-title>. <source>IEEE Trans. Automat. Sci. Eng.</source> <volume>13</volume>, <fpage>32</fpage>&#x2013;<lpage>42</lpage>. <pub-id pub-id-type="doi">10.1109/TASE.2015.2498192</pub-id> </citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>X. M.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Vehicle headway modeling and its inferences in macroscopic/microscopic traffic flow theory: a survey</article-title>. <source>Transportation Res. Part C: Emerging Tech.</source> <volume>76</volume>, <fpage>170</fpage>&#x2013;<lpage>188</lpage>. <pub-id pub-id-type="doi">10.1016/j.trc.2017.01.007</pub-id> </citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ma</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Tao</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Long short-term memory neural network for traffic speed prediction using remote microwave sensor data</article-title>. <source>Transportation Res. Part C: Emerging Tech.</source> <volume>54</volume>, <fpage>187</fpage>&#x2013;<lpage>197</lpage>. <pub-id pub-id-type="doi">10.1016/j.trc.2015.03.014</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mathew</surname>
<given-names>T. V.</given-names>
</name>
<name>
<surname>Ravishankar</surname>
<given-names>K. V. R.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Neural network based vehicle-following model for mixed traffic conditions</article-title>. <source>Eur. Transport</source> <volume>52</volume> (<issue>4</issue>), <fpage>1</fpage>&#x2013;<lpage>15</lpage>. </citation>
</ref>
<ref id="B26">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Motamedidehkordi</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Amini</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Hoffmann</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Busch</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Fitriyanti</surname>
<given-names>M. R.</given-names>
</name>
</person-group> (<year>2017</year>). &#x201c;<article-title>Modeling tactical lane-change behavior for automated vehicles: a supervised machine learning approach</article-title>,&#x201d; in <conf-name>5th IEEE International conference on models and technologies for intelligent transportation systems</conf-name>, <conf-loc>Naples, Italy</conf-loc>, <conf-date>June, 2017</conf-date> (<publisher-name>IEEE</publisher-name>), <fpage>268</fpage>&#x2013;<lpage>273</lpage>. <pub-id pub-id-type="doi">10.1109/MTITS.2017.8005678</pub-id> </citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Neumann</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Bohnke</surname>
<given-names>P. L.</given-names>
</name>
<name>
<surname>Touko Tcheumadjeu</surname>
<given-names>L. C.</given-names>
</name>
</person-group> (<year>2013</year>). &#x201c;<article-title>Dynamic representation of the fundamental diagram via Bayesian networks for estimating traffic flows from probe vehicle data</article-title>,&#x201d; in <conf-name>16th IEEE conference on intelligent transportation systems (ITSC)</conf-name>, <conf-loc>The Hague, Netherlands</conf-loc>, <conf-date>October 6&#x2013;9, 2013</conf-date>. <pub-id pub-id-type="doi">10.1109/ITSC.2013.6728501</pub-id> </citation>
</ref>
<ref id="B28">
<citation citation-type="web">
<collab>NGSIM</collab> (<year>2006</year>). <article-title>Next generation simulation</article-title>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://ops.fhwa.dot.gov/trafficanalysistools/ngsim.htm">https://ops.fhwa.dot.gov/trafficanalysistools/ngsim.htm</ext-link>
</comment> (<comment>Accessed September 10, 2002</comment>). </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ozguven</surname>
<given-names>E. E.</given-names>
</name>
<name>
<surname>Ozbay</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>Nonparametric Bayesian estimation of freeway capacity distribution from censored observations</article-title>. <source>Transportation Res. Rec.</source> <volume>2061</volume>, <fpage>20</fpage>&#x2013;<lpage>29</lpage>. <pub-id pub-id-type="doi">10.3141/2061-03</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Papadopoulou</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Roncoli</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Bekiaris-Liberis</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Papamichail</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Papageorgiou</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Microscopic simulation-based validation of a per-lane traffic state estimation scheme for highways with connected vehicles</article-title>. <source>Transportation Res. Part C: Emerging Tech.</source> <volume>86</volume>, <fpage>441</fpage>&#x2013;<lpage>452</lpage>. <pub-id pub-id-type="doi">10.1016/j.trc.2017.11.012</pub-id> </citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Papathanasopoulou</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Antoniou</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Towards data-driven car-following models</article-title>. <source>Transportation Res. Part C: Emerging Tech.</source> <volume>55</volume>, <fpage>496</fpage>&#x2013;<lpage>509</lpage>. <pub-id pub-id-type="doi">10.1016/j.trc.2015.02.016</pub-id> </citation>
</ref>
<ref id="B32">
<citation citation-type="web">
<collab>PeMS</collab> (<year>2018</year>). <article-title>Freeway performance measurement system</article-title>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="http://pems.dot.ca.gov/">http://pems.dot.ca.gov/</ext-link>
</comment> (<comment>Accessed July 7, 2018</comment>). </citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Persaud</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Yagar</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Brownlee</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>1998</year>). <article-title>Exploration of the breakdown phenomenon in freeway traffic</article-title>. <source>Transportation Res. Rec.</source> <volume>1634</volume>, <fpage>64</fpage>&#x2013;<lpage>69</lpage>. <pub-id pub-id-type="doi">10.3141/1634-08</pub-id> </citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Piotrowski</surname>
<given-names>A. P.</given-names>
</name>
<name>
<surname>Napiorkowski</surname>
<given-names>J. J.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>A comparison of methods to avoid overfitting in neural networks training in the case of catchment runoff modelling</article-title>. <source>J. Hydrol.</source> <volume>476</volume>, <fpage>97</fpage>&#x2013;<lpage>111</lpage>. <pub-id pub-id-type="doi">10.1016/j.jhydrol.2012.10.019</pub-id> </citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Polson</surname>
<given-names>N. G.</given-names>
</name>
<name>
<surname>Sokolov</surname>
<given-names>V. O.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Deep learning for short-term traffic flow prediction</article-title>. <source>Transportation Res. Part C: Emerging Tech.</source> <volume>79</volume>, <fpage>1</fpage>&#x2013;<lpage>17</lpage>. <pub-id pub-id-type="doi">10.1016/j.trc.2017.02.024</pub-id> </citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Roncoli</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Papageorgiou</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Papamichail</surname>
<given-names>I.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Motorway traffic flow optimisation in presence of vehicle automation and communication systems</article-title>. <source>Comput. Methods Appl. Sci.</source> <volume>38</volume>, <fpage>1</fpage>&#x2013;<lpage>16</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-319-18320-6_1</pub-id> </citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rumelhart</surname>
<given-names>D. E.</given-names>
</name>
<name>
<surname>Hinton</surname>
<given-names>G. E.</given-names>
</name>
<name>
<surname>Williams</surname>
<given-names>R. J.</given-names>
</name>
</person-group> (<year>1986</year>). <article-title>Learning representations by back-propagating errors</article-title>. <source>Nature</source> <volume>323</volume>, <fpage>533</fpage>&#x2013;<lpage>536</lpage>. <pub-id pub-id-type="doi">10.1038/323533a0</pub-id> </citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Seo</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Bayen</surname>
<given-names>A. M.</given-names>
</name>
<name>
<surname>Kusakabe</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Asakura</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Traffic state estimation on highway: a comprehensive survey</article-title>. <source>Annu. Rev. Control</source> <volume>43</volume>, <fpage>128</fpage>&#x2013;<lpage>151</lpage>. <pub-id pub-id-type="doi">10.1016/j.arcontrol.2017.03.005</pub-id> </citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Seo</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Kusakabe</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Asakura</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Estimation of flow and density using probe vehicles with spacing measurement equipment</article-title>. <source>Transportation Res. Part C: Emerging Tech.</source> <volume>53</volume>, <fpage>134</fpage>&#x2013;<lpage>150</lpage>. <pub-id pub-id-type="doi">10.1016/j.trc.2015.01.033</pub-id> </citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Seo</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Kusakabe</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Probe vehicle-based traffic state estimation method with spacing information and conservation law</article-title>. <source>Transportation Res. Part C: Emerging Tech.</source> <volume>59</volume>, <fpage>391</fpage>&#x2013;<lpage>403</lpage>. <pub-id pub-id-type="doi">10.1016/j.trc.2015.05.019</pub-id> </citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shiomi</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Yoshii</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Kitamura</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>Platoon-based traffic flow model for estimating breakdown probability at single-lane expressway bottlenecks</article-title>. <source>Transportation Res. Part B: Methodological</source> <volume>45</volume>, <fpage>1314</fpage>&#x2013;<lpage>1330</lpage>. <pub-id pub-id-type="doi">10.1016/j.trb.2011.05.008</pub-id> </citation>
</ref>
<ref id="B42">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Simonelli</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Bifulco</surname>
<given-names>G. N.</given-names>
</name>
<name>
<surname>Martinis</surname>
<given-names>V. D.</given-names>
</name>
<name>
<surname>Punzo</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2015</year>). &#x201c;<article-title>Human-like adaptive cruise control systems through a learning machine approach</article-title>,&#x201d; in <source>Applications of soft computing</source>, <fpage>240</fpage>&#x2013;<lpage>249</lpage>. <pub-id pub-id-type="doi">10.1016/j.limno.2013.04.005</pub-id> </citation>
</ref>
<ref id="B43">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Wei</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Dolan</surname>
<given-names>J. M.</given-names>
</name>
<name>
<surname>Litkouhi</surname>
<given-names>B.</given-names>
</name>
</person-group> (<year>2010</year>). &#x201c;<article-title>A learning-based autonomous driver: emulate human driver&#x2019;s intelligence in low-speed car following</article-title>,&#x201d; in <conf-name>The International Society for Optical Engineering</conf-name>, <conf-loc>Orlando, FL</conf-loc>, <conf-date>April 5&#x2013;9, 2010</conf-date> (<publisher-name>IEEE</publisher-name>), <fpage>76930L</fpage>. <pub-id pub-id-type="doi">10.1117/12.852413</pub-id> </citation>
</ref>
<ref id="B44">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zheng</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Suzuki</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Fujita</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Car-following behavior with instantaneous driver-vehicle reaction delay: a neural-network-based methodology</article-title>. <source>Transportation Res. Part C: Emerging Tech.</source> <volume>36</volume>, <fpage>339</fpage>&#x2013;<lpage>351</lpage>. <pub-id pub-id-type="doi">10.1016/j.trc.2013.09.010</pub-id> </citation>
</ref>
<ref id="B45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhou</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Qu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>A recurrent neural network based microscopic car following model to predict traffic oscillation</article-title>. <source>Transportation Res. Part C: Emerging Tech.</source> <volume>84</volume>, <fpage>245</fpage>&#x2013;<lpage>264</lpage>. <pub-id pub-id-type="doi">10.1016/j.trc.2017.08.027</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>