Benchmark Dataset for Evaluation of Range-Based People Tracker Classifiers in Mobile Robots

Álvarez-Aparicio, Claudia; Guerrero-Higueras, Ángel Manuel; Olivera, Maria Carmen Calvo; Rodríguez-Lera, Francisco J.; Martín, Francisco; Matellán, Vicente

doi:10.3389/fnbot.2017.00072

DATA REPORT article

Front. Neurorobot., 15 January 2018

Volume 11 - 2017 | https://doi.org/10.3389/fnbot.2017.00072

Claudia Álvarez-Aparicio¹

Ángel Manuel Guerrero-Higueras¹*

Maria Carmen Calvo Olivera¹

Francisco J. Rodríguez-Lera²

Francisco Martín³

Vicente Matellán¹

¹Grupo de Robótica, Universidad de León, León, Spain
²University of Luxembourg, Luxembourg, Luxembourg
³Robotics Lab, Universidad Rey Juan Carlos, Madrid, Spain

1. Introduction

Detecting and tracking people is a very useful skill for many systems, in particular for improving the social navigation of mobile robots and for making robots more socially acceptable. Many solutions in the literature address this problem using a multimodal approach, typically combining vision and range sensors, as seen in Arras et al. (2012). Vision sensors are more expensive and are more likely to gather contradictory information. For this reason, systems based only on range sensors are desirable. Regarding the classifiers used to process sensor data, convolutional neural networks are becoming a very popular solution, according to Long et al. (2015).

Laser imaging detection and ranging (LIDAR) sensors are reliable and currently affordable range sensors that provide information about the environment at rates (~20–30 Hz) that are adequate for dynamic environments. Their data are easy to process on board in real time because each scan consists of an array of only a few hundred integers.

Mobile robots usually mount laser scanners in a low position (~30–50 cm from the ground) to detect dynamic obstacles. The scanners are also used to construct occupancy maps and to navigate. The information provided allows estimating the distance at precise angles (resolution of 0.5°). Because only range readings at this height are available, objects such as table or chair legs, trunks of plants, etc., may be easily confused with the legs of persons. It is also difficult to keep track of a particular person (i.e., a pair of legs) in a crowded environment because many occlusions can occur.

Fitting neural networks requires a good training dataset. Collecting and organizing a training set requires time as well as domain-specific knowledge. There is a large collection of robotic datasets available from various mobile robots, vehicles, and handheld sensors, such as the Repository of robotics and computer vision datasets¹ for the Mobile Robot Programming Toolkit (MRPT). However, most of these datasets may not be suitable for training neural networks. This data report summarizes a benchmark dataset that can be used to evaluate the performance of different approaches for detecting and tracking people using LIDAR sensors. The information contained in the dataset is especially suitable for use as training data for neural network-based classifiers.

The data currently contained in the dataset allow the evaluation of two people trackers, both neural network-based: leg detector (LD), a solution widely used by the Robot Operating System (ROS) community, see Quigley et al. (2009); and a people-tracker tool developed by the Robotics Group at the University of León, known as PeTra.

The rest of this paper is organized as follows: Section 2 describes the systems and the environment where data were gathered. It also specifies the procedure and tools employed. Section 3 explains how the dataset can be accessed and enumerates some applications and limitations for the data included in the dataset.

2. Materials and Methods

The following section describes the materials (shown in Figure 1) used to gather data, which include: a certified study area, an autonomous robot with an on-board LIDAR sensor, and a real-time location system (RTLS) to obtain ground-truth data about person location. Recorded data include location estimates calculated by two people trackers, LD and PeTra, also described below. Finally, the recording procedure used to build the dataset is explained.

Figure 1. Materials used in the experiments. (A) Orbi-One robot. (B) Leon@Home Testbed plan. (C) Floor plan of the apartment. Red dots show the location of KIO anchors. Black numbered dots denote Orbi-One locations during the data gathering. (D) KIO anchors. (E) General view of the apartment. (F) Furniture in mock-up apartment.

2.1. Leon@Home Testbed

Data have been gathered at Leon@Home Testbed.² This is a Certified Testbed³ of the European Robotics League (ERL). Its main purpose is to benchmark service robots in a realistic home environment. The testbed is made up of four parts, shown in Figure 1B: a mock-up apartment, a control zone with direct vision (glass wall) into the apartment, a small workshop, and a larger development zone, where researchers work.

Leon@Home Testbed is located on the second floor of the Módulo de Investigación en Cibernética (Building for Research in Cybernetics) on the Vegazana Campus of the University of León (Spain). The apartment is a single bedroom mock-up home built in an 8 m × 7 m space. Figure 1C shows a plan of the apartment. 60 cm high walls divide it into a kitchen, living room, bathroom, and bedroom. The furniture (Figures 1E,F) has been chosen to test different robot abilities. For instance, the kitchen cabinets all have different types of handles.

2.2. Orbi-One Robot

Orbi-One (Figure 1A) is an assistant robot manufactured by Robotnik.⁴ It has several sensors, among them an RGBD camera, a LIDAR sensor, and an inertial unit. It can operate a manipulator arm attached to its torso and has a wheeled base for moving around the room. Orbi-One includes a wireless access point, which allows WiFi communications with other robots and computers.

The software to control the robot hardware is based on a ROS framework. ROS is basically a set of libraries for robotics similar to operating system services, providing hardware abstraction for sensors and actuators, low-level device control, and inter-process communication. Computation takes place in processes named Nodes, which can receive and send Messages. Nodes publish Messages into information buffers called Topics.

2.3. KIO RTLS

In order to acquire ground-truth data about person location in the study area, an RTLS for indoor environments is needed. The commercial KIO RTLS solution by Eliko⁵ has been used. KIO is a precise RTLS for tracking any object in 2- or 3-dimensional space. Its Ultra Wideband technology makes it possible to micro-position objects through obstructions. KIO also works in non-line-of-sight conditions, both indoors and outdoors.

KIO comes in two main configurations. The Regular Cell configuration guarantees a reliable accuracy of ±30 cm, according to the manufacturer’s specifications. The Small Cell configuration is designed for location-critical applications and provides reliable ±5 cm accuracy, according to the manufacturer’s specifications. Calibration done by the authors of this paper on the mock-up apartment shows that the error is higher in some areas, and lower in others, but on average, the claims of the manufacturer are correct.

KIO calculates the position of a mobile transceiver, called a Tag. In order to do so, KIO uses radio beacons, called Anchors, distributed at known positions in the surroundings. Figure 1D shows a KIO anchor. KIO tags are the same size as the anchors and must be carried by the tracked subject, in our case people. The red dots in Figure 1C show the location of the six anchors used in these experiments. They are placed on the ceiling. The distribution of the anchors has been chosen following the method shown in Guerrero-Higueras et al. (2017).

2.4. Leg Detector (LD)

LD is a ROS package that takes messages published by a LIDAR sensor as input and uses a machine-learning-trained classifier to identify groups of laser readings as possible legs. The code is available in a public repository,⁶ but is unsupported at this time.

LD publishes the location for the individual legs. It can also attempt to pair the legs and publish their average as an estimate of where the center of a person is. LD may optionally also publish visualization marker messages to indicate where detections happened.

2.5. PeTra

PeTra is a tool for detecting and tracking people, developed by the Robotics Group at the University of León. The system is based on a Convolutional Neural Network (CNN) using a configuration based on the U-Net architecture by Ronneberger et al. (2015).

The system performs the following steps in real time:

1. First, the data provided by the LIDAR sensor are processed to build a two dimensional occupancy map centered around the robot. This occupancy map is represented as a binary matrix, where 1s denote positions where the LIDAR scan found an obstacle, and 0s denote positions where the LIDAR scan either went through without detecting any obstacle or did not go through that position.

2. Then, the occupancy map is relayed to the network as input data. The network produces a second occupancy map representing the zones where legs have been detected.

3. Finally, center of mass calculations return the location of persons. PeTra also publishes locations for the individual legs and Marker messages for visualization.
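The first step above can be illustrated with a minimal sketch. It is not PeTra's actual code: the function name `scan_to_occupancy`, the grid size, and the maximum range are assumptions chosen for illustration, and the input fields mirror those of a ROS LaserScan message (ranges, start angle, angular increment).

```python
import numpy as np

def scan_to_occupancy(ranges, angle_min, angle_increment,
                      grid_size=256, max_range=5.0):
    """Project one planar LIDAR scan onto a robot-centred binary grid.

    grid_size and max_range are illustrative assumptions, not values
    taken from PeTra.
    """
    ranges = np.asarray(ranges, dtype=float)
    angles = angle_min + angle_increment * np.arange(ranges.size)

    # Keep only finite returns that fall inside the area covered by the grid.
    valid = np.isfinite(ranges) & (ranges > 0.0) & (ranges < max_range)
    x = ranges[valid] * np.cos(angles[valid])
    y = ranges[valid] * np.sin(angles[valid])

    # Convert metres to cell indices, with the robot at the grid centre.
    cell = 2.0 * max_range / grid_size
    cols = np.clip(np.floor((x + max_range) / cell).astype(int), 0, grid_size - 1)
    rows = np.clip(np.floor((y + max_range) / cell).astype(int), 0, grid_size - 1)

    # 1 = the scan hit an obstacle in this cell, 0 = free or not observed.
    grid = np.zeros((grid_size, grid_size), dtype=np.uint8)
    grid[rows, cols] = 1
    return grid
```

A matrix of this kind is what a segmentation network would receive as input in step 2; the cell size trades spatial detail against input size.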

2.6. Recording Procedure

The data were gathered in 14 different situations. In all of them, Orbi-One stood still as one or more people, carrying a KIO tag, moved around it. Three different locations for Orbi-One were defined (see Figure 1C), resulting in 42 scenarios (14 situations × 3 Orbi-One locations). Figure 2 shows the 14 different recognition situations recorded. These situations have been chosen to reflect cases that may occur in robotics competitions such as ERL⁷ or RoboCup.⁸

Figure 2. Recognition situations recorded.

A rosbag file⁹ was created for each scenario (except for situations 3, 12, and 13, where 3 rosbag files were recorded; situation 4, where 4 rosbag files were recorded; and situation 9, where 5 rosbag files were recorded), recording LIDAR sensor measurements, location estimates from PeTra and LD, locations from KIO RTLS, and other useful data. Specifically, the following data were included in the rosbag files:

• LIDAR sensor data. Data from LIDAR sensors are provided as ROS LaserScan Messages,¹⁰ which include, among other information, the following: acquisition time of the first ray in the scan, start/end angle of the scan, angular distance between measurements, and range data.

• PeTra location estimates provided as ROS PointStamped Messages,¹¹ which include a position [x, y, z] and a timestamp.

• Location estimates calculated by LD. It publishes data for individual legs (as ROS PositionMeasurementArray Messages¹²). It also attempts to pair the legs and publishes their average as an estimate of where the center of a person is as a ROS PositionMeasurement Message.¹³

• Locations provided by KIO RTLS, also as ROS PointStamped Messages.

• Messages from /map, /odom, and /tf ROS topics, which include map information, odometry of the robot base, and transformation information, respectively.

3. Results and Discussion

As a result of applying the recording method explained above, a first version of the dataset has been released. The dataset is known as “Range-based people tracker classifiers Benchmark Dataset”¹⁴ (RRID:SCR_015743). Data can be accessed at a public repository.¹⁵ Further information can be found at the University of Leon Robotics group web site,¹⁶ including information about contents of the rosbag files available for each scenario: start date/time, duration, and size.

The data gathered may be used to evaluate the performance of LD and PeTra. In order to decide empirically which one offers the best results, the person location estimates from both systems can be compared to the ground-truth data provided by KIO RTLS. The accuracy error of PeTra (e_PeTra) at a given instant can be calculated as the Euclidean distance between its location estimate (l_PeTra) and the ground-truth location provided by KIO (l_KIO). Equation (1) shows the calculation of e_PeTra, where n is the number of dimensions considered. In this case, only two dimensions need to be considered, since a mobile robot moves on the ground.

e_{PeTra} = d(l_{PeTra}, l_{KIO}) = \sqrt{\sum_{i=1}^{n} (l_{PeTra_{i}} - l_{KIO_{i}})^{2}} .

(1)

The same can be done with the accuracy error of LD (e_LD), calculated as the Euclidean distance between its location estimates (l_LD) and ground-truth locations (l_KIO) as shown in Equation (2).

e_{LD} = d(l_{LD}, l_{KIO}) = \sqrt{\sum_{i=1}^{n} (l_{LD_{i}} - l_{KIO_{i}})^{2}} .

(2)

Once it is known how to calculate e_PeTra and e_LD at a given instant, it is possible to observe their evolution over time to select the system that works better.
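As an illustration, the error defined in Equations (1) and (2) is the standard Euclidean distance, i.e., the square root of the sum of squared coordinate differences. The sketch below assumes that estimates and ground truth are already expressed in the same frame and refer to the same instants; the function name `location_error` is hypothetical.

```python
import numpy as np

def location_error(estimate, ground_truth, n=2):
    """Euclidean distance over the first n coordinates.

    Accepts a single point or an array of points (one per row), so the
    same call gives one error or the error at every instant.
    """
    est = np.asarray(estimate, dtype=float)[..., :n]
    gt = np.asarray(ground_truth, dtype=float)[..., :n]
    return np.sqrt(np.sum((est - gt) ** 2, axis=-1))

# Only x and y are used by default, so the z component is ignored.
e_single = location_error([3.0, 4.0, 9.0], [0.0, 0.0, 0.0])

# One error per time step for a short hypothetical trajectory.
e_series = location_error([[1.0, 1.0], [2.0, 2.0]],
                          [[1.0, 0.0], [2.0, 0.0]])
```

Applying the function to the time-aligned estimates of each tracker yields error series whose mean or distribution can then be compared.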

Regarding the above, there are two important issues to deal with:

1. KIO, PeTra, and LD use their own coordinate origins to represent locations. In order to compare these locations, they have to be expressed with respect to the same coordinate origin. This can be done by using the translations and rotation quaternions published on the /tf topic.

2. Each message published on the recorded topics has its own timestamp with nanosecond precision, so comparing locations at a given instant is not straightforward. A synchronization method is needed to compare measurements from different topics. The pandas library, as seen in McKinney (2010), may be useful in doing so. It has merge methods to combine measurements by nearest timestamp.
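Both issues can be sketched as follows. This is an illustrative outline, not the procedure used by the authors. It assumes planar motion (only the rotation about the vertical axis is applied), a quaternion in (x, y, z, w) order, and measurements already exported to pandas DataFrames with an integer nanosecond column; the column name `t_ns`, the function names, and the 50 ms tolerance are hypothetical.

```python
import numpy as np
import pandas as pd

def yaw_from_quaternion(x, y, z, w):
    """Rotation about the vertical axis encoded by a unit quaternion."""
    return np.arctan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))

def to_common_frame(points_xy, translation_xy, quaternion_xyzw):
    """Rotate, then translate, 2-D points into the target frame."""
    yaw = yaw_from_quaternion(*quaternion_xyzw)
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s], [s, c]])
    pts = np.asarray(points_xy, dtype=float)
    return pts @ rot.T + np.asarray(translation_xy, dtype=float)

def sync_nearest(estimates, ground_truth, tolerance_ns=50_000_000):
    """Pair each estimate with the ground-truth row nearest in time.

    Rows without a partner within the tolerance are dropped.
    """
    merged = pd.merge_asof(
        estimates.sort_values("t_ns"),
        ground_truth.sort_values("t_ns"),
        on="t_ns",
        direction="nearest",
        tolerance=tolerance_ns,
        suffixes=("_est", "_gt"),
    )
    return merged.dropna().reset_index(drop=True)
```

Here `pandas.merge_asof` with `direction="nearest"` performs the nearest-timestamp merge; the tolerance prevents pairing an estimate with a ground-truth sample recorded much earlier or later.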

Although the dataset was initially built to evaluate the performance of PeTra, it is important to note that any other people tracker could be evaluated. The rosbag files include LIDAR sensor measurements, which may be used by any other people tracker that takes LIDAR sensor measurements as input. By comparing its location estimates with the ground-truth data provided by KIO, it is possible to calculate the accuracy error. In addition, these results can be compared with those of LD, which is the most popular people tracker using LIDAR sensor measurements.

Author Contributions

PeTra is based on CÁ-A’s work for her end of degree assignment, which was proposed and supervised by VM. FM provided useful comments regarding the configuration of the neural network. MO carried out most of the data-gathering in León, supported by AG-H in León and FR-L in Luxembourg. AG-H and VM did the KIO calibration and drafted the manuscript.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

The authors would like to thank A. Yugueros and R. M. Redondo for their support during the data gathering.

Funding

The research described in this article has been partially funded by the Instituto Nacional de Ciberseguridad de España (INCIBE), under the Grant “Cybersecurity in Systems with Autonomous Capacities (Addendum 21),” Addendum to the framework agreement INCIBE–Universidad de León, 2016–2018; and by Junta de Castilla y León, under grant LE028P17.

References

Arras, K. O., Lau, B., Grzonka, S., Luber, M., Mozos, O. M., Meyer, D., et al. (2012). “Range-based people detection and tracking for socially enabled service robots,” in Towards Service Robots for Everyday Environments. STAR 76 (Springer), 235–280.

Guerrero-Higueras, Á. M., DeCastro-García, N., Rodríguez-Lera, F. J., and Matellán, V. (2017). Empirical analysis of cyber-attacks to an indoor real time localization system for autonomous robots. Comput. Secur. 70, 422–435. doi: 10.1016/j.cose.2017.06.013

Long, J., Shelhamer, E., and Darrell, T. (2015). “Fully convolutional networks for semantic segmentation,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 3431–3440.

McKinney, W. (2010). “Data structures for statistical computing in Python,” in Proceedings of the 9th Python in Science Conference, eds S. van der Walt and J. Millman, 51–56. Available at: http://conference.scipy.org/proceedings/scipy2010/mckinney.html

Quigley, M., Conley, K., and Gerkey, B. (2009). “ROS: an open-source robot operating system,” in ICRA Workshop on Open Source Software, Vol. 3 (Kobe, Japan), 5–10.

Ronneberger, O., Fischer, P., and Brox, T. (2015). “U-net: convolutional networks for biomedical image segmentation,” in Medical Image Computing and Computer-Assisted Intervention (MICCAI), Volume 9351 of LNCS (Springer), 234–241.

Keywords: people tracking, mobile robots, range-based location, LIDAR sensors, convolutional neural networks

Citation: Álvarez-Aparicio C, Guerrero-Higueras ÁM, Olivera MCC, Rodríguez-Lera FJ, Martín F and Matellán V (2018) Benchmark Dataset for Evaluation of Range-Based People Tracker Classifiers in Mobile Robots. Front. Neurorobot. 11:72. doi: 10.3389/fnbot.2017.00072

Received: 04 October 2017; Accepted: 20 December 2017;
Published: 15 January 2018

Edited by:

Hong Qiao, University of Chinese Academy of Sciences (UCAS), China

Reviewed by:

Maria Pateraki, Foundation for Research and Technology Hellas, Greece
Lior Shamir, Lawrence Technological University, United States

Copyright: © 2018 Álvarez-Aparicio, Guerrero-Higueras, Olivera, Rodríguez-Lera, Martín and Matellán. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Ángel Manuel Guerrero-Higueras, am.guerrero@unileon.es

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.