Multi-Sensor Data Fusion for Accurate Traffic Speed and Travel Time Reconstruction

This paper studies the joint reconstruction of traffic speeds and travel times by fusing sparse sensor data. Raw speed data from inductive loop detectors and floating cars as well as travel time measurements are combined using different fusion techniques. A novel fusion approach is developed which extends existing speed reconstruction methods to integrate low-resolution travel time data. Several state-of-the-art methods and the novel approach are evaluated on their performance in reconstructing traffic speeds and travel times using various combinations of sensor data. Algorithms and sensor setups are evaluated with real loop detector, floating car and Bluetooth data collected during severe congestion on German freeway A9. Two main aspects are examined: (i) which algorithm provides the most accurate result depending on the used data and (ii) which type of sensor and which combination of sensors yields higher estimation accuracies. Results show that, overall, the novel approach applied to a combination of floating-car data and loop data provides the best speed and travel time accuracy. Furthermore, a fusion of sources improves the reconstruction quality in many, but not all cases. In particular, Bluetooth data only provide a benefit for reconstruction purposes if integrated distinctively.


Introduction
For various applications in traffic engineering, it is fundamental to know the traffic conditions on a road stretch with high certainty and sufficient spatio-temporal accuracy. A complete representation of traffic conditions is especially crucial for understanding traffic flow, for analyzing the effectiveness of control measures and for training data-driven prediction models. In contrast to real-time or predictive state estimation, these analyses are usually performed retrospectively.
The retrospective analysis often focuses on average vehicle speeds per time and space interval on a road, since these allow, among other things, travel times for road users to be deduced and jam tail warnings [1] to be issued, which aim at reducing rear-end collisions at jam tails. However, with current sensor technology, average vehicle speeds are not measured at all times and places on a road stretch. Rather, various types of sensors provide traffic-related data at different times and places. Raw sensor data must therefore be processed in order to obtain an accurate reconstruction of traffic conditions.
Nowadays, several sensor technologies are in place to gather data, each with its own advantages and disadvantages. Induction loops, which are buried in the road surface, provide very exact and reliable speed information but are limited to few road stretches since installation and maintenance costs are high. Floating-Car Data (FCD), also called probe data, are gathered from vehicles or smartphones that determine their position via Global Navigation Satellite Systems (GNSS) and report this position on a regular basis to a central server. Time and space differences allow for reconstructing the probe's speed profile on a road. FCD are available wherever traffic is flowing, but represent only a sub-sample of the whole fleet. With WiFi/Bluetooth (BT) sensor technology, the unique MAC address of a device that passes two neighboring stations is registered, which allows the travel time, and therefore the average speed, of that device between the two stations to be derived [2,3,4,5,6]. BT installation is not expensive but, like FCD, the receivers do not collect information from all vehicles. Additionally, since the receivers are usually placed several kilometers apart, the resulting average speed is spatially less granular.
Measuring traffic conditions with various sensors offers a great opportunity to increase the accuracy of traffic state estimates. However, the differing characteristics of each technology make the fusion of the sources challenging. The aim of an advanced fusion method is to make use of all information hidden in the data and to compute a combined result that outperforms estimates based on a single source. Additionally, a combination of sufficiently precise sensors that is available at lower cost might be a reasonable compromise for decision makers, so knowing such combinations would be beneficial to them.
Given various sensor technologies and various algorithms to process the collected data, it is difficult to decide which technology to adopt and which algorithm to deploy. This paper seeks to support decision makers, practitioners and researchers in selecting the combination of sensor data and the reconstruction approach that provide the greatest benefit for their specific problem. Since a real-world application requires algorithms to cope with sparse and missing data, this paper studies robust approaches that can be applied directly. Based on real data collected on a German freeway, various algorithms and combinations of sensor technology are evaluated. Results comprise the accuracy of reconstructed space-time speeds as well as the accuracy of deduced travel times.
The paper is structured as follows. Section 2 reviews the literature on the comparison of different traffic data detection systems and on information fusion approaches. Section 3 describes the study site and the data used to evaluate the approaches. In Section 4, existing applicable fusion methods are briefly summarized. Subsection 4.3 describes the adaptation of the Phase-Based Smoothing Method (PSM) to consider BT data in a distinct way. Section 5 presents the applied quality metrics and the results obtained by applying the methods to varying sensor setups. Section 6 summarizes the results and outlines potential further research directions.

State of the Art
Comparisons of different traffic detection technologies have been widely performed in the past. In [7], a comprehensive summary of available sensors and fusion techniques is given. The authors of [8] compared Bluetooth measurements and loop detector data in the Greater Toronto Area on a stretch of several kilometers. In [9], the authors describe an offline comparison between loop detectors and floating cars, determining which is able to detect a traffic incident earlier. In [10], the authors statistically analyze the differences between loop detectors and floating car data in the area of Lille, France.
Additionally, different fusion techniques have been investigated. El Faouzi and Klein [11] give a survey of current data fusion techniques for intelligent transportation systems. In [12], they present three widely applied data fusion techniques and describe their relevance to Intelligent Transportation Systems (ITS): Bayesian inference, Dempster-Shafer evidential reasoning, and Kalman filtering. In [13], an evidence-theory-based data fusion approach for traffic incident detection is described. Data from inductive detectors, camera observation and floating cars are fused on a rather short stretch of a few hundred meters on an urban highway. The authors of [14] applied data fusion techniques for traffic planning and control in a setting with satellite images, acoustic and GPS data. In [15], the authors describe a real-time capable framework for the fusion of loop detector and GPS data. This framework is able to distinguish lane-based traffic states. The authors of [16] study the fusion of loop data and toll collection data using a Dempster-Shafer approach in order to obtain an improved travel time estimate. In [17], an approach to network-wide traffic state estimation combining loop detector and floating car data is presented. The authors of [1] developed a model to fuse FCD and loop detector data to forecast congestion fronts on a freeway. A comparison of two model-based filtering approaches is conducted in [18]. The results are confirmed using synthetic data from a simulation. Liu et al. [19] describe an extended Kalman filter method for freeway traffic state estimation fusing two data sources: wireless communication records and microwave sensor detections. Another Kalman-filter-based approach is given in [20]. In [21], the authors discuss a data fusion approach for cellphone probes and fixed sensors, and give a sensitivity analysis of impact factors. The article [22] describes a data fusion for travel time estimation from toll collection stations and stationary vehicle detectors in Taiwan. Rostami et al. [23] propose a fusion of loop data and FCD at intersections to estimate queue lengths and outflows. In [24] and [25], a fusion of loop data and FCD is also described, with the goal of approximating the Macroscopic Fundamental Diagram of urban networks. Bachmann et al. [26,27] compared seven fusion methods for traffic speed and travel time estimation. One key finding is that a simple convex combination of loop detector and BT measurements is one of the best fusion strategies. However, the data stem from micro-simulations, which tend to idealize real data. The authors of [28] fuse loop data, FCD and camera data using the Adaptive Smoothing Method (ASM) to improve the speed of jam detection and of the respective control measures. An evaluation is performed using simulated data.
The mentioned studies are mostly limited to the usage of two different sensors, which limits their applicability in many scenarios. Furthermore, they mostly investigate only one quality metric, e.g. the spatio-temporal speed distribution or the travel time. Some of the mentioned studies focus on the estimation of traffic conditions in dense networks, which is a different challenge from the one addressed in this paper. Furthermore, data are often derived from micro-simulations, which allow for extensive studies but result in data that are usually more homogeneous and less noisy than real data. If studies utilize empirical data, they often focus on a rather short road stretch, which gives insight into only that specific freeway section.
The approach described in this paper is based on empirical data collected via three common sensor technologies on a long stretch of a German freeway: loop detectors, FCD and low-frequency travel time data from BT devices. The number of data points is large, which allows a detailed study of all combinations of data as well as of several algorithms processing the data. Furthermore, this work applies two metrics, which provide insights into the accuracy of both reconstructed traffic speeds and reconstructed travel times. The algorithms compared in this study are state-of-the-art methods such as the ASM, the PSM and simple averaging methods, as well as an extension of the PSM. This extension is a minor but effective change to the PSM, which allows for the integration of low-frequency travel time data in order to achieve higher reconstruction accuracies.

Notation & Data
Speed measurements of all sensors are considered on a road stretch of length $X$ over a time period $T$. The data of all detection technologies are represented as spatio-temporally discrete speed values on a uniform grid with step sizes $\Delta X = 100$ m and $\Delta T = 60$ s. Thus, the domain can be represented as a matrix with $n_X$ rows and $n_T$ columns, where an entry (also called cell in the following) is referred to as $(i, j)$, with $i = 1, \dots, n_T$ and $j = 1, \dots, n_X$. In each cell, the speed value is constant per data source and is denoted as $v_{i,j}$. Given the set $S = \{FCD, LOOP, BT\}$ of sensor technologies on the considered road stretch, $V_s$, $s \in S$, denote the speed matrices of FCD, loop detectors and BT sensors, respectively.
FCD comprise trajectories of vehicles. The trajectory of a vehicle contains all information that the vehicle, equipped with a GNSS device, collects about its space-time speed. An equipped vehicle samples its current position $x$ at time $t$ with a certain frequency, and thus generates tuples $(t, x)$, $t \in [0, T]$, $x \in [0, X]$, along the road stretch. Since no further speed information is given, the vehicle's speed between two sampled positions is, for simplicity, assumed to be constant. With sampling intervals in the same order of magnitude as the time discretization of the domain, this basic assumption is sufficient. In order to turn the piece-wise linearly interpolated vehicle position into grid speeds, a cell-wise speed $v_{i,j} = \Delta x_{i,j} / \Delta t_{i,j}$ is computed for each grid cell passed by the vehicle, i.e. each cell in which the vehicle traveled a distance $\Delta x_{i,j} \geq 0$ m during a time $\Delta t_{i,j} > 0$ s. The cell-wise speeds of all traces are computed and subsequently aggregated. If there are multiple speeds for the same cell, the harmonic mean of all assigned speed values is taken. The respective output matrix comprising all speed data from all equipped vehicles is denoted as $V_{FCD} \in \mathbb{R}^{n_X \times n_T}$.
Speeds measured by loop detectors are given at discrete positions along the road stretch, with a temporal resolution of one minute. For each loop detector, the measured speeds are assigned to the corresponding cells in the grid. The resulting matrix is denoted as $V_{LOOP} \in \mathbb{R}^{n_X \times n_T}$.
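The gridding of FCD described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `trace_to_cell_speeds` and `aggregate_harmonic` are hypothetical, indices are 0-based (the text counts from 1), and the handling of malformed samples and of zero speeds is an assumption made here.

```python
import math
from collections import defaultdict

DX = 100.0  # spatial grid step [m], as stated in the text
DT = 60.0   # temporal grid step [s], as stated in the text


def trace_to_cell_speeds(trace, dx=DX, dt=DT):
    """Map one trace [(t, x), ...] to {(i, j): speed in m/s}.

    i is a 0-based time index, j a 0-based space index.
    """
    dist = defaultdict(float)
    dur = defaultdict(float)
    for (t0, x0), (t1, x1) in zip(trace, trace[1:]):
        if t1 <= t0 or x1 < x0:
            continue  # assumption: malformed sample pairs are skipped
        v = (x1 - x0) / (t1 - t0)  # constant speed between two samples
        # Cut the segment wherever it crosses a time or space grid line.
        cuts = {t0, t1}
        k = math.floor(t0 / dt) + 1
        while k * dt < t1:
            cuts.add(k * dt)
            k += 1
        if v > 0:
            m = math.floor(x0 / dx) + 1
            while m * dx < x1:
                cuts.add(t0 + (m * dx - x0) / v)
                m += 1
        cuts = sorted(cuts)
        for a, b in zip(cuts, cuts[1:]):
            t_mid = 0.5 * (a + b)
            x_mid = x0 + v * (t_mid - t0)
            cell = (int(t_mid // dt), int(x_mid // dx))
            dist[cell] += v * (b - a)  # distance travelled inside the cell
            dur[cell] += b - a         # time spent inside the cell
    return {c: dist[c] / dur[c] for c in dur if dur[c] > 0}


def aggregate_harmonic(cell_speed_dicts):
    """Harmonic mean per cell over all traces."""
    inverse = defaultdict(list)
    for cells in cell_speed_dicts:
        for cell, v in cells.items():
            if v > 0:  # assumption: zero speeds are left out of the mean
                inverse[cell].append(1.0 / v)
    return {cell: len(s) / sum(s) for cell, s in inverse.items()}
```

For example, a trace sampled at `(0 s, 0 m)` and `(10 s, 200 m)` contributes a speed of 20 m/s to the first two space cells of the first time step.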
Low-resolution travel times provided by BT are interpolated based on the Bluetooth Interpolation Algorithm [29]. This method considers travel times through predefined cells and weights all paths crossing a cell according to the share of the path inside the cell in order to obtain an averaged speed distribution $V_{BT} \in \mathbb{R}^{n_X \times n_T}$.
The studies presented in the subsequent sections are applied to data collected on May 29, 2019 on German autobahn A9 (Fig. 1) in the northbound direction during severe traffic congestion. The markers in Fig. 1 depict the positions of the loop detectors and Bluetooth receivers, respectively. FCD are collected from a fleet of cars equipped with a GNSS device. With sampling intervals between 5 s and 20 s, depending on the software version, each vehicle collects positions and timestamps. Packets of positions and timestamps are reported to a central server. In order to ensure privacy, the transmission ID is shuffled from time to time, and some packets are retained such that tracing a vehicle over its entire journey is not possible.
All in all, time-discrete data of 27 loop detectors, 11,722 BT samples and 1,578 FCD traces are available. Fig. 2 displays the raw data.

Fusion Methods
This section presents the fusion methods studied in this paper. Three state-of-the-art fusion methods are summarized and an extension of the PSM is presented. All methods investigated in the subsequent evaluation have to satisfy three requirements. First, they must take only gridded speed data as input, since FCD and BT contain little information about flow or density. Second, they must not impose requirements regarding minimum data coverage, e.g. a penetration rate of FCD or a minimum distance between neighboring detectors. This ensures real-world applicability, where a sensor may fail, or where no equipped vehicle may pass a road segment for a longer period of time. Finally, the output of the algorithm must be a continuous speed estimate.

ASM Approach
The ASM is a well-known approach used for traffic state reconstruction [30,31,32,33] and also for on-line traffic speed estimation [34]. Briefly summarized, raw data of a sparse input source are smoothed in two traffic-characteristic directions: $v_{cong}$ denotes the wave speed in congested traffic conditions, and $v_{free}$ the wave speed in free-flow conditions. In a discrete time-space domain, the resulting complete speed matrices $V_{cong}(t, x)$ and $V_{free}(t, x)$ are combined into one estimate as a weighted average with weight $w(t, x)$. The weight $w(t, x)$ is adaptive and favors low speeds, with $V_{thr}$ a threshold at which the weight $w(t, x)$ equals 0.5 and $\Delta V$ a parameter controlling the steepness of the weight function. In a theoretical analysis as well as an evaluation with real data, van Lint et al. pointed out that smoothing speeds yields a significant error in travel time accuracy [35]. Instead, they propose smoothing the inverted cell-wise speeds in order to reduce the error. Since travel time accuracy is one of the two key quality metrics in this evaluation, this procedure is adopted in this study, replacing the original formulation of the ASM.
Accordingly, for each data source $s$, the discrete space-time matrix $V^{ASM}_s \in \mathbb{R}^{n_X \times n_T}$ is computed. For a fusion, raw data are combined cell-wise and the combined raw data are processed with the ASM. If at least two data sources provide a speed for the same cell, the harmonic mean is taken.
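The two ASM equations did not survive in this copy of the text. The sketch below uses the form commonly given in the ASM literature, which is an assumption here: the combination $V = w V_{cong} + (1 - w) V_{free}$ with $w = \frac{1}{2}\left[1 + \tanh\left((V_{thr} - \min(V_{cong}, V_{free})) / \Delta V\right)\right]$. The function names and the default parameter values are placeholders, not the values of [31], and the combination is shown on speeds rather than on the inverted speeds adopted in this study.

```python
import math


def asm_weight(v_cong, v_free, v_thr, delta_v):
    """Adaptive weight in [0, 1]; equals 0.5 when min(v_cong, v_free) == v_thr."""
    v_star = min(v_cong, v_free)
    return 0.5 * (1.0 + math.tanh((v_thr - v_star) / delta_v))


def asm_combine(v_cong, v_free, v_thr=60.0, delta_v=20.0):
    """Weighted average of the congested and free-flow estimates of one cell.

    v_thr and delta_v are placeholder values in km/h (assumption).
    """
    w = asm_weight(v_cong, v_free, v_thr, delta_v)
    return w * v_cong + (1.0 - w) * v_free
```

With these placeholder values, a cell with a congested estimate of 20 km/h and a free-flow estimate of 100 km/h is dominated by the congested estimate, which illustrates how the weight favors low speeds.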

PSM Approach
The PSM is an approach based on concepts of the ASM. It was developed to reconstruct space-time traffic speeds with higher accuracy given only FCD [36]. It utilizes findings summarized by the Three-Phase traffic theory [37,38] in order to distinguish between localized and moving congestion. The method outperformed the ASM in a recent study [36] and is therefore included in the comparison as a state-of-the-art method. We refer to the original paper for a detailed derivation and evaluation of the method.
Briefly summarized, in the first step of the PSM, raw data are smoothed in the direction of the typical speed propagation of each traffic phase. $v_{cong}$ is assumed to be the propagation speed of moving congestion with low vehicle speeds (also called Wide Moving Jams (WMJs) in the Three-Phase traffic theory). Congestion that is caused by a bottleneck, e.g. a construction site or an on-ramp, is often localized, and its downstream front is attached to the bottleneck location. In order to account for this locality, data are smoothed only in the temporal direction for the so-called synchronized traffic flow phase. Based on the speeds and the amount of available data, each cell $(t, x)$ is classified into one of the three phases (free flow, synchronized flow or WMJ) using probability theory.
In the second step, phase-specific speed estimates are computed. Raw speed data that are assigned to a specific phase are smoothed using either a free-flow kernel parameterized with $v_{free}$ or a congested kernel parameterized with $v_{cong}$. The phase-specific speed estimates are aggregated into a final speed estimate using a weighted average.
The input to the PSM consists of gridded speeds. Additionally, a weight matrix $w^{PSM} \in \mathbb{R}^{n_X \times n_T}$, holding one weight per cell, can be given as input to the method. Applying the PSM to the raw data of the input sources as well as to their cell-wise combinations (see section 4.1), the respective output matrices $V^{PSM}_s$ are computed. The weight $w^{PSM}$ is set to one for cells with valid data, and to zero for cells without data.

Extended PSM Approach Considering Low-Frequency Probe Data
In order to apply the mentioned approaches, BT data are turned into cell-wise speeds by computing the mean speed of each trace and assigning it to the grid cells passed [29] (see Fig. 2). However, since BT detectors are usually placed several kilometers apart, the mean speed of a vehicle is a significant simplification of its real speed profile. For instance, if there was a mixture of congested and free traffic between two detector locations, the mean speed smooths out all details. For travel time estimation, this approach gives accurate results. If, however, the space-time speed distribution is desired, the cell-wise speeds lack accuracy. Combining such smoothed speeds with other data sources that deliver more accurate information can even worsen the resulting output, despite using more data.
Therefore, the idea presented in this extension is to assign a dynamic weight to the gridded BT speed data, which expresses the trustworthiness of the computed grid speeds. The trustworthiness is influenced by the detector spacing and the measured travel time.
Assume a vehicle needs time $\Delta t$ to travel distance $\Delta x$ (see Fig. 3). It has a maximum speed of $v_{max}$ and a minimum speed of $v_{min}$. Further assume that the vehicle is never standing, such that $v_{min} > 0$. Then, an observer who only measured $\Delta t$ and $\Delta x$ does not know where the vehicle was positioned, and at what speed it was driving, while covering the measured distance in the measured time. From the observer's perspective, however, given the assumed minimum and maximum speed, the vehicle's position can be restricted to a certain space-time area. This area is depicted as a parallelogram in Fig. 3, along with three examples of potential vehicle trajectories. Each potential trajectory can be described by the vehicle's position $x(t)$ and its corresponding velocity $v_c(t)$. As illustrated, a medium travel time allows for strong deviations of $v_c(t)$ over time, whereas short travel times restrict $v_c(t)$ to higher speeds. Long travel times can only be realized with vehicle speeds close to $v_{min}$.
A reconstruction method such as the PSM is sensitive to wrongly assigned speeds in cells. Therefore, given the chance that the vehicle had a completely different speed profile than the one computed using a simple linear interpolation, the accuracy of the reconstruction suffers. In order to consider the probability of deviation in the reconstruction method, the following approach is implemented: the variety of potential trajectories is modeled as the space-time area $A_{BT}$ of the parallelogram formed by the time and space difference and the assumed minimum and maximum vehicle speeds $v_{min}$ and $v_{max}$. The magnitude of the area is supposed to affect the weight of a trace. If the area is large, indicating a great variety of potential trajectories, the weight shall be low. If the area is small, the number of potential trajectories is low and the weight shall be high. An exponential function is utilized to model the decay of the weight $w_A$ with increasing area $A_{BT}$ (eq. (3)), with $\gamma \in \mathbb{R}$ a parameter to adjust the sensitivity. For each cell $(t, x)$, the weights $w_A \in [0, 1]$ of all traces passing the cell under a linear interpolation are averaged and assigned to $w_{BT}$.
The novel fusion method is denoted as 'PSM-W', referring to a dedicated input source weighting of BT data. When combining the raw data of loops, FCD and BT, this approach takes the weighted average of all cell speeds as input. Raw loop data and FCD are assigned a constant weight of one, and BT data are assigned the weight $w_{BT}$.
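Eq. (3) is not legible in this copy of the text. The sketch below assumes the form $w_A = \exp(-A_{BT} / \gamma)$, which matches the described exponential decay and the unit of $\gamma$ (m·s). The parallelogram area follows from elementary geometry: a vehicle that drives at $v_{max}$ first and at $v_{min}$ afterwards (or vice versa) traces the boundary, which gives $A_{BT} = (\Delta x - v_{min} \Delta t)(v_{max} \Delta t - \Delta x) / (v_{max} - v_{min})$. All function names are hypothetical, and treating measurements outside the speed bounds as having zero area is an assumption, as is the use of an arithmetic weighted mean for the cell-wise combination.

```python
import math

KMH = 1000.0 / 3600.0  # conversion factor km/h -> m/s


def parallelogram_area(dx, dt, v_min=5.0 * KMH, v_max=130.0 * KMH):
    """Space-time area [m*s] reachable between two detections dx [m], dt [s] apart."""
    a = dx - v_min * dt  # slack relative to driving v_min all the way
    b = v_max * dt - dx  # slack relative to driving v_max all the way
    if a <= 0.0 or b <= 0.0:
        return 0.0  # assumption: no freedom left at or beyond the speed bounds
    return a * b / (v_max - v_min)


def bt_weight(dx, dt, gamma=500_000.0):
    """Assumed form of eq. (3): exponential decay of the weight with the area."""
    return math.exp(-parallelogram_area(dx, dt) / gamma)


def weighted_cell_speed(speeds_and_weights):
    """Arithmetic weighted mean of (speed, weight) pairs for one cell (assumption)."""
    total = sum(w for _, w in speeds_and_weights)
    if total <= 0.0:
        return None
    return sum(v * w for v, w in speeds_and_weights) / total
```

For a detector spacing of 5 km, a travel time of 600 s yields an area of about 2,000,000 m·s and hence a weight of roughly exp(-4), whereas a travel time of 140 s, close to the free-flow bound, yields a weight near one. This mirrors the statement that medium travel times are the least trustworthy.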

Section Average
The 'section-average' approach averages collected data in predefined sections. Due to its simplicity, it is still applied in practice and is thus considered a relevant approach in this comparison. For each data source, time-space sections are defined, and all data related to such a section are collected and averaged. Specifically, for loop detectors, section borders are located halfway between two adjacent detector positions. A cell is assigned the speed measurement collected by the spatially closest detector at the same moment in time. If a measurement is missing due to a detector outage, the closest measurement in time is taken. The resulting speed matrix is denoted as $V^{SEC}_{LOOP}$. Start and end times of BT samples are collected at the locations of the BT detectors. For each section and each time step $\Delta t$, all BT traces that cross the section are identified. The total distance covered by these traces in the section during $\Delta t$, divided by the respective total time of all traces in the section, is the resulting average speed at time $t$ for all cells that belong to the section. The resulting speed estimate is denoted as $V^{SEC}_{BT}$. The same approach is applied to FCD. In contrast to stationary detectors, there are no predefined sections for FCD. For simplification, the same sections as for detector data are used. In order to assign values to sections without data, a temporal linear interpolation is performed. The resulting matrix is called $V^{SEC}_{FCD}$. Fusions of pairs of matrices and of all three matrices are simple cell-wise averages of the speeds.
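The loop-detector part of the section-average approach can be illustrated with a short sketch. It assumes that outages are marked with `None`, and the function name `nearest_detector_grid` and the tie-breaking rule (the earlier measurement wins) are choices made here for illustration only.

```python
def nearest_detector_grid(det_pos, det_speeds, cell_centers):
    """Assign each cell the speed of the spatially closest detector.

    det_pos:      detector positions [m]
    det_speeds:   det_speeds[d][i] is the speed of detector d at time step i,
                  or None during an outage
    cell_centers: positions of the cell centers [m]
    Returns grid[j][i] with j the space index and i the time index.
    """
    n_t = len(det_speeds[0])
    grid = []
    for x in cell_centers:
        # Borders halfway between detectors are equivalent to a nearest-detector rule.
        d = min(range(len(det_pos)), key=lambda n: abs(det_pos[n] - x))
        valid = [i for i in range(n_t) if det_speeds[d][i] is not None]
        row = []
        for i in range(n_t):
            if not valid:
                row.append(None)  # detector delivered no data at all
            else:
                # Outage: fall back to the closest valid measurement in time.
                nearest = min(valid, key=lambda k: abs(k - i))
                row.append(det_speeds[d][nearest])
        grid.append(row)
    return grid
```

A detector at 0 m with an outage in the middle of three time steps would, under these assumptions, reuse its first measurement for the missing step.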

Methodology
The aim of an accurate reconstruction method is to generate a complete speed estimate in time and space that is suited to various subsequent applications. Conventionally, the quality of a reconstruction is assessed using speed data only. The drawback is that a potential bias in estimated speeds, e.g. a systematic over-estimation, is not penalized. As a result, estimated travel times over larger segments are erroneous. Therefore, we consider it necessary to assess both the accuracy of cell-wise speed estimates and the accuracy of virtual travel times. In the following, the combination of both aspects is referred to as the reconstruction quality or accuracy. In order to assess the reconstruction accuracy, the following considerations are taken into account:
(1) As visible in the raw data plots, the measurements of each data source are sparse in time and space.
(2) Loop detectors provide accurate speed measurements but are limited to certain locations.
(3) FCD provide relatively accurate speed estimates for varying times and spaces but do not represent macroscopic speeds.
(4) BT-based travel time measurements are abundant, though the cell-based speeds are inaccurate due to large distances between neighboring stations.
For these reasons, the space-time speed should be assessed using those data sources with high spatio-temporal accuracy, whereas the evaluation of travel times requires a source with accurate travel time measurements. Therefore, a combination of FCD and loop detector data is used to assess the cell-wise speed estimates, and BT data are used to assess the travel time accuracy. A commonly used approach in model training and evaluation is to divide the available data into a training and a test data set. Fig. 4 depicts the methodology applied in this evaluation. First, each data source is randomly divided into a training and a test set with a ratio of 50:50. Specifically, all speed measurements gathered at one detector position are assigned either to the training data or to the test data. FCD and BT are assigned per trace. Training data are fused in order to generate an estimate $V_E$, and test data of FCD, loop detectors and BT are used to assess the reconstruction quality.
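The split described above operates on whole units (a detector position or a trace) rather than on individual measurements. A minimal sketch, with the hypothetical helper `split_units` and an optional seed added here only for reproducibility:

```python
import random


def split_units(unit_ids, ratio=0.5, seed=None):
    """Randomly assign whole units (detector positions or trace IDs) to train/test."""
    rng = random.Random(seed)
    ids = list(unit_ids)
    rng.shuffle(ids)
    cut = int(round(len(ids) * ratio))
    return set(ids[:cut]), set(ids[cut:])


# Usage: split the 27 loop detectors; FCD and BT traces are split the same way.
train_loops, test_loops = split_units(range(27), seed=1)
```

Splitting per unit keeps all measurements of one detector or one trace on the same side, so the test data are never interleaved with training data from the same source unit.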
The quality assessment with a combination of FCD and loop data is done using the Inverse Mean Average Error (IMAE), eq. (4). It is a symmetric metric that is sensitive to deviations at lower speeds. It is evaluated over the set $v^{test}$ of all cell-wise speeds $v_{i,j}$ contained in the test data, which is defined as the union of all cell-wise speeds in the test sets of FCD and loop data.
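Eq. (4) is missing from this copy of the text. A form consistent with the description (symmetric, and more sensitive at low speeds) is the mean absolute difference of inverse speeds, $\mathrm{IMAE} = \frac{1}{|v^{test}|} \sum_{v_{i,j} \in v^{test}} \left| 1/v^{E}_{i,j} - 1/v_{i,j} \right|$; this is an assumption, and the sketch below should be read as one plausible instance rather than the exact definition used in the paper.

```python
def imae(estimate, test_speeds):
    """Mean absolute error of inverse speeds (assumed form of eq. (4)).

    estimate:    {(i, j): estimated speed}
    test_speeds: iterable of ((i, j), measured speed)
    Cells with non-positive or missing speeds are skipped (assumption).
    """
    errors = [
        abs(1.0 / estimate[cell] - 1.0 / v)
        for cell, v in test_speeds
        if v > 0 and estimate.get(cell, 0.0) > 0
    ]
    return sum(errors) / len(errors)
```

Under this form, a deviation of 10 m/s between 10 m/s and 20 m/s contributes 0.05 s/m, while the same deviation between 30 m/s and 40 m/s contributes less than 0.01 s/m, which illustrates the stronger penalty at low speeds.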
Quality assessment of travel times with BT is based on the comparison of virtual trajectories with the traces measured by the BT detectors. For each measured trace, a virtual trajectory is computed that starts at the same time and location $(t_{start}, x_{start})$ as the real trace. The virtual vehicle drives with the continuous representation of speed $V_E(t, x(t))$ until reaching $x_{end}$. Its virtual travel time is the time elapsed between the departure at $t_{start}$ and the arrival at $x_{end}$. Given $n_{BT}$ as the number of BT travel time samples in the test set, $TT_i$ as the measured and $VTT_i$, $i = 1, \dots, n_{BT}$, as the virtual travel times, the Mean Absolute Percentage Error (MAPE) is applied as a quality metric. A relative metric reduces the effect of varying segment distances between neighboring BT receivers.
The parameter set for the ASM is taken in accordance with [31]. The PSM is parameterized according to [36]. Based on some experiments, $\gamma$ is set to 500,000 m·s. A formal sensitivity analysis and optimization are left for future work. $v_{min}$ is set to 5 km/h and $v_{max}$ to 130 km/h. The random split between test and training set is redone in each run. In total, the speed estimation for all scenarios and algorithms as well as the quality assessment was repeated 50 times, and average results are presented.
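The virtual trajectory and the MAPE can be sketched as follows. The integration scheme (explicit Euler with a fixed step), the callable `speed_at` standing in for $V_E(t, x)$, the abort criterion and the function names are assumptions made for illustration. The MAPE is written in its usual form, $\frac{100}{n_{BT}} \sum_i |VTT_i - TT_i| / TT_i$.

```python
def virtual_travel_time(speed_at, t_start, x_start, x_end, step=1.0, t_max=6 * 3600.0):
    """Integrate dx/dt = speed_at(t, x) from x_start until x_end is reached.

    speed_at: callable (t [s], x [m]) -> speed [m/s], a stand-in for V_E
    Returns the travel time [s], or None if the vehicle stalls or exceeds t_max.
    """
    t, x = t_start, x_start
    while x < x_end:
        v = speed_at(t, x)
        if v <= 0 or t - t_start > t_max:
            return None  # assumption: abort instead of waiting indefinitely
        remaining = x_end - x
        if v * step >= remaining:
            return t + remaining / v - t_start  # partial last step
        x += v * step
        t += step
    return 0.0


def mape(measured, virtual):
    """Mean absolute percentage error between measured and virtual travel times."""
    pairs = [(m, v) for m, v in zip(measured, virtual) if v is not None and m > 0]
    return 100.0 * sum(abs(v - m) / m for m, v in pairs) / len(pairs)
```

With a constant estimated speed of 20 m/s, a virtual vehicle needs 250 s for a 5 km segment; comparing such values with the measured BT travel times gives the relative error per trace.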

Results
This study intends to give insights into several aspects that arise in a multi-sensor data fusion. In order to structure the outcomes, the results are examined with respect to two questions:
1. Given a certain sensor set-up on a road and several algorithms that can be applied to process raw data, which algorithm returns the most accurate results?
2. Given the freedom to choose between the three sources of sensor data, which data source or which combination yields best results?

Algorithm Assessment
Fig. 5 depicts the mean IMAE and MAPE of all scenarios and algorithms. Several observations can be made:
1. The available sensor data have a significant impact on the resulting errors for each algorithm.
2. The IMAE has a higher variance than the MAPE.
3. Some algorithms perform best with respect to the IMAE in a scenario but are outperformed with respect to the MAPE (e.g. with FCD only, 'PSM' has a lower IMAE but 'ASM' a lower MAPE).This shows that both quality metrics measure different properties of an algorithm.
4. The 'SEC-AVG' is the algorithm which results in the lowest accuracy, for IMAE as well as MAPE in most scenarios.Given only 'LOOP+BT', this algorithm has a slight advantage over the 'ASM' and 'PSM'.Still, the 'PSM-W' performs better.
5. The 'PSM-W' performs significantly better in IMAE and MAPE in all scenarios that involve BT data.
6. On average, the 'PSM-W' provides the best quality results. In a 'LOOP'-only scenario, the 'ASM' performs better.
Fig. 6 visualizes the estimation results of all algorithms as well as the IMAE with respect to all available data. It can be observed that the estimate computed with the section-average approach (a) results in large errors downstream of the heavy congestion at kilometer 522. Furthermore, the approach fails to reconstruct the moving jams that emerge after 3:30 pm. The reconstructions obtained with the ASM (b) and the PSM (c) reveal a higher spatio-temporal accuracy. However, even these approaches spatially overestimate the heavy congestion and are not very accurate at reconstructing the moving jams either. The main reason is that all BT data, with their low space-time accuracy at mid-range speeds (compare section 4.3), are smoothed, which blurs the fine structure of the congestion.
Applying the 'PSM-W' (d) with the adapted weighting of BT data according to eq. (3) (see Fig. 7) overcomes this issue. Traces with medium travel times and those collected on long segments tend to have a lower weight. Thus, both the speed profile of the heavy congestion and that of the moving jams are reconstructed more precisely. The PDFs of 'SEC-AVG', 'ASM' and 'PSM' are similar to each other, and exhibit a wider distribution than the PDF corresponding to 'PSM-W'. This explains the lower resulting MAPE of the 'PSM-W'.

Sensor Setup Assessment
Suppose that one wishes to install an array of traffic sensors on a stretch of road for the purpose of providing accurate traffic speed information. In that case, it is relevant to know the quality that a single sensor technology or a combination of sensor technologies may achieve. Fig. 9 shows, for each sensor combination, the lowest achieved IMAE and MAPE across all algorithms. Several observations can be made:
2. The usage of more technologies does not necessarily improve the reconstruction quality. For example, 'LOOP+FCD+BT' is not the most accurate combination.
3. With respect to IMAE, BT provides the lowest accuracy.
4. With respect to MAPE, loops provide the lowest accuracy.
5. Using FCD or combinations with FCD improves both quality metrics significantly.
6. The integration of BT data improves the quality in some cases (MAPE: 'FCD+BT', 'LOOP+BT'), but worsens it in others (IMAE: 'FCD+BT').
Overall, the combination of loop data and FCD is the best choice. However, if, for instance, FCD are not available, a combination of loop and BT data provides more accurate results than either source alone. Thus, these findings support the decision process of setting up sensors on a road, or of complementing stationary data with FCD.

Discussion
The present study examines two major aspects of a multi-sensor data fusion: the reconstruction accuracy using different combinations of sensor data, and the accuracy applying different stateof-the-art algorithms (as well as a novel approach) to different sensor combinations.Additionally, the reconstruction accuracy is measured using two metrics.A welcome result of such a study would be a clear recommendation on which algorithm or data to use in general in order to obtain the most accurate estimates.However, as the comparison showed, the choice of metric has an influence on the most accurate approach and sensor combination.For example, adding BT data barely improved, and sometimes even worsened, the quality of the space-time speed reconstruction.On the other hand, the travel time accuracy of Figure 9: Lowest IMAE and MAPE using the most accurate reconstruction algorithm with respect to the available data source virtual trajectories improved by adding BT.The same is true for the choice of algorithm.If only loop data are given, the ASM performs best in IMAE and MAPE.Given other data, specially BT data, the weighted PSM-W performs best.Compared to the original PSM, its accuracy is the same or better, thus, it successfully extends this approach without compromises.Thus, as a result, depending on the desired speed and travel time accuracy, this study helps to pick the optimal sensor setup or algorithm, depending on the given situation.
Some factors that may have an impact on the results are held fixed in this study, though they may vary in other applications. First, the penetration rate and sampling interval of FCD and the spacing of stationary detectors may vary. Second, the situation used for assessment in this paper is a mixture of two traffic patterns according to the classification of the Three-Phase theory: mega-jam and General Pattern [37]. These patterns cause large travel time losses and are thus especially important to reconstruct accurately. In future work, the study may be extended to further congestion patterns occurring on different days and roads.

Conclusion
This paper studies multi-sensor data fusion for traffic speed and travel time reconstruction. Two aspects are analyzed: (1) which algorithm is the most accurate depending on different combinations of data sources, and (2) which is the best performance one can achieve with a flexible sensor setup. To this end, three state-of-the-art methods (the ASM, the PSM and a simple averaging method) as well as a novel approach are used to reconstruct the traffic speed and travel times given sparse data.
The novel approach extends the PSM. It introduces a variable weighting of BT measurements, depending on detector spacing and measured travel time, which expresses the trustworthiness of a measurement. The weighting allows for a dynamic integration of BT data with other data sources.
The mentioned questions are studied using empirical loop data, BT data and FCD collected during severe congestion on a German freeway. Data are divided into a reconstruction and a test set. Various combinations of algorithms and data are used to reconstruct the space-time traffic speed and the travel times. The error metrics IMAE and MAPE are used to assess the resulting reconstruction accuracies.
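The travel time assessment can be illustrated with a minimal Python sketch. It assumes that the reconstructed speeds are given as a piecewise-constant space-time grid and that a virtual trajectory is propagated cell by cell through this grid; the resulting travel times are then compared with measured ones using the standard definition of MAPE. The grid layout, the function names `virtual_travel_time` and `mape`, the minimum speed `v_min` and all numbers are assumptions made for illustration and are not taken from the study. The IMAE is not sketched because its definition is not restated in this section.

```python
import numpy as np


def virtual_travel_time(speed, dx, dt, t_start, v_min=0.1):
    """Travel time (s) of a virtual vehicle crossing a speed grid.

    speed[i, j] is the reconstructed speed (m/s) in space cell i and
    time cell j; dx (m) and dt (s) are the cell sizes. The vehicle
    enters at x = 0 at time t_start. Beyond the last time cell, the
    last time slice is held constant (an assumption of this sketch).
    """
    n_x, n_t = speed.shape
    x_end = n_x * dx
    x, t = 0.0, float(t_start)
    i = 0
    j = min(int(t // dt), n_t - 1)
    while i < n_x:
        v = max(float(speed[i, j]), v_min)  # floor avoids a stalled vehicle
        x_bound = min((i + 1) * dx, x_end)
        t_bound = (j + 1) * dt if j < n_t - 1 else float("inf")
        tau_x = (x_bound - x) / v  # time needed to leave the space cell
        tau_t = t_bound - t        # time until the time cell ends
        if tau_x <= tau_t:
            # The vehicle leaves the space cell first.
            t += tau_x
            x = x_bound
            i += 1
        else:
            # The time cell ends first; continue with the next slice.
            x += v * tau_t
            t = t_bound
            j += 1
    return t - t_start


def mape(estimated, measured):
    """Mean absolute percentage error in percent."""
    estimated = np.asarray(estimated, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return 100.0 * np.mean(np.abs(estimated - measured) / measured)


# Hypothetical 2 x 2 grid: two 1000 m cells, two 60 s time slices.
grid = np.array([[20.0, 10.0],
                 [10.0, 10.0]])
tt = virtual_travel_time(grid, dx=1000.0, dt=60.0, t_start=0.0)  # 150.0 s
error = mape([tt], [160.0])  # compared with a hypothetical measured 160 s
```

In this toy grid, the vehicle needs 50 s for the first cell, travels 100 m of the second cell before the time slice changes, and covers the remaining 900 m at 10 m/s, giving 150 s in total.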
Key findings are that the novel approach outperforms the other algorithms in most of the cases. Furthermore, a combination of FCD and loop detector data provides the best overall results. The integration of Bluetooth data does not necessarily improve the reconstruction quality, depending on the error measure chosen. However, if no FCD are available, a combination of loop data and BT data is a better choice than only one source of data.
Next steps may include a mathematical optimization of the applied parameters and further studies on sensor spacings. Furthermore, the study could be extended to other locations and congestion patterns.

Figure 2: Raw speed data measurements provided by (a) loop detectors, (b) equipped vehicles and (c) Bluetooth receivers

Figure 3: Different travel time measurements and the space of potentially realized trajectories that result in each travel time: (i) medium travel time, (ii) short travel time, (iii) long travel time

Figure 4: Flow of information of test and training set of sensor data for fusion and quality assessment

Figure 5: Mean (a) IMAE and (b) MAPE of all runs with respect to the available sensor technology and the applied algorithm

Figure 6: Reconstructed speeds applying each algorithm (a) SEC-AVG, (b) ASM, (c) PSM, (d) PSM-W to the training data (on the left) and resulting IMAEs comparing the reconstructed speeds to all available data (right)

Figure 7: Resulting weight applying the speed-adaptive conversion of travel time samples

Figure 8: Approximated probability density function of relative errors comparing the travel times of virtual trajectories based on each algorithm with the measured travel times collected via BT devices