City Research Online

Depth-Enhanced Deep Learning Approach For Monocular Camera Based 3D Object Detection

Wang, C. & Aouf, N. ORCID: 0000-0001-9291-4077 (2024). Depth-Enhanced Deep Learning Approach For Monocular Camera Based 3D Object Detection. Journal of Intelligent & Robotic Systems, 110(3), article number 101. doi: 10.1007/s10846-024-02128-w

Abstract

Automatic 3D object detection using monocular cameras presents significant challenges in the context of autonomous driving. Precise labeling of 3D object scales requires accurate spatial information, which is difficult to obtain from a single image due to the inherent lack of depth information in monocular images, compared to LiDAR data. In this paper, we propose a novel approach to address this issue by enhancing deep neural networks with depth information for monocular 3D object detection. The proposed method comprises three key components: 1)Feature Enhancement Pyramid Module: We extend the conventional Feature Pyramid Networks (FPN) by introducing a feature enhancement pyramid network. This module fuses feature maps from the original pyramid and captures contextual correlations across multiple scales. To increase the connectivity between low-level and high-level features, additional pathways are incorporated. 2)Auxiliary Dense Depth Estimator: We introduce an auxiliary dense depth estimator that generates dense depth maps to enhance the spatial perception capabilities of the deep network model without adding computational burden. 3)Augmented Center Depth Regression: To aid center depth estimation, we employ additional bounding box vertex depth regression based on geometry. Our experimental results demonstrate the superiority of the proposed technique over existing competitive methods reported in the literature. The approach showcases remarkable performance improvements in monocular 3D object detection, making it a promising solution for autonomous driving applications.

Publication Type: Article
Additional Information: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Publisher Keywords: 3D Object detection, Autonomous driving, Machine learning
Subjects: T Technology > TJ Mechanical engineering and machinery
Departments: School of Science & Technology
School of Science & Technology > Engineering
SWORD Depositor:
[thumbnail of s10846-024-02128-w.pdf]
Preview
Text - Published Version
Available under License Creative Commons: Attribution International Public License 4.0.

Download (2MB) | Preview

Export

Add to AnyAdd to TwitterAdd to FacebookAdd to LinkedinAdd to PinterestAdd to Email

Downloads

Downloads per month over past year

View more statistics

Actions (login required)

Admin Login Admin Login