ORIGINAL RESEARCH article

Front. Robot. AI

Sec. Robot Vision and Artificial Perception

Volume 12 - 2025 | doi: 10.3389/frobt.2025.1560342

This article is part of the Research Topic: Vision AI in Robotic Perception and Mapping.

Metric Scale Non-fixed Obstacles Distance Estimation by Using 3D Map and Monocular Camera

Provisionally accepted
  • Meijo University, Nagoya, Japan

The final, formatted version of the article will be published soon.

Obstacle avoidance is important for autonomous driving. Metric scale obstacle detection using a monocular camera has been studied for this purpose; in this study, metric scale obstacle detection means detecting obstacles and measuring the distance to them on a metric scale. We have already developed PMOD-Net, which realizes metric scale obstacle detection for autonomous driving by using a monocular camera and a 3D map. However, PMOD-Net's distance error for non-fixed obstacles, which do not exist on the 3D map, is large. Accordingly, this study addresses the problem of improving distance estimation for non-fixed obstacles for obstacle avoidance. To solve the problem, we focused on the fact that PMOD-Net performs object detection and distance estimation simultaneously. We developed a new loss function called DifSeg. DifSeg is calculated from the distance estimation results on the non-fixed obstacle region, which is defined based on the object detection results. Therefore, DifSeg makes PMOD-Net focus on non-fixed obstacles during training. We evaluated the effect of DifSeg by using the CARLA simulator, KITTI, and an original indoor dataset. The evaluation results showed that distance estimation accuracy improved on all datasets. In particular, on KITTI, the distance estimation error of our method was 2.42 m, which was 2.14 m less than that of the latest monocular depth estimation method.
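The abstract does not give the exact formula for DifSeg, but the mechanism it describes (a distance-error term restricted to the region that the detection/segmentation output labels as non-fixed obstacles) can be sketched as follows. This is a minimal PyTorch sketch under stated assumptions, not the authors' published implementation: the function name difseg_loss, the class ID list NON_FIXED_CLASSES, and the choice of an L1 penalty are all hypothetical, introduced only for illustration.

import torch
import torch.nn.functional as F

# Hypothetical class IDs for non-fixed obstacles (e.g., pedestrians,
# vehicles); the actual label set is not specified in the abstract.
NON_FIXED_CLASSES = (11, 12, 13)

def difseg_loss(pred_depth, gt_depth, pred_seg_logits):
    """Sketch of a DifSeg-style loss: penalize depth error only on
    pixels the segmentation head assigns to non-fixed obstacle classes.

    pred_depth:      (B, 1, H, W) predicted metric depth
    gt_depth:        (B, 1, H, W) ground-truth metric depth (0 = invalid)
    pred_seg_logits: (B, C, H, W) per-class segmentation logits
    """
    seg_labels = pred_seg_logits.argmax(dim=1, keepdim=True)  # (B, 1, H, W)

    # Mask of pixels predicted as non-fixed obstacles that also have
    # valid ground-truth depth.
    obstacle_mask = torch.zeros_like(seg_labels, dtype=torch.bool)
    for c in NON_FIXED_CLASSES:
        obstacle_mask |= seg_labels == c
    valid_mask = obstacle_mask & (gt_depth > 0)

    if valid_mask.sum() == 0:
        # No non-fixed obstacles in this batch: contribute nothing.
        return pred_depth.new_zeros(())

    # Mean absolute depth error restricted to the non-fixed obstacle region.
    return F.l1_loss(pred_depth[valid_mask], gt_depth[valid_mask])

Restricting the penalty to the predicted obstacle mask is what would concentrate the gradient signal on regions the 3D map cannot explain; in practice such a term would presumably be added to the network's base depth loss with a weighting coefficient, though the abstract does not state how the terms are combined.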

Keywords: Obstacle detection, Depth completion, Monocular depth estimation, 3D map, Semantic segmentation, Autonomous Driving

Received: 14 Jan 2025; Accepted: 19 May 2025.

Copyright: © 2025 Higashi, Fukuta and Tasaki. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Tsuyoshi Tasaki, Meijo University, Nagoya, Japan

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.