LATR: 3D Lane Detection from Monocular Images with Transformer

Luo, Yueru; Zheng, Chaoda; Yan, Xu; Kun, Tang; Zheng, Chao; Cui, Shuguang; Li, Zhen


arXiv:2308.04583 (cs)

[Submitted on 8 Aug 2023 (v1), last revised 20 Aug 2023 (this version, v2)]




Abstract: 3D lane detection from monocular images is a fundamental yet challenging task in autonomous driving. Recent advances primarily rely on structural 3D surrogates (e.g., bird's eye view) built from front-view image features and camera parameters. However, the depth ambiguity in monocular images inevitably causes misalignment between the constructed surrogate feature map and the original image, posing a great challenge for accurate lane detection. To address this issue, we present LATR, a novel end-to-end 3D lane detector that uses 3D-aware front-view features without a transformed view representation. Specifically, LATR detects 3D lanes via cross-attention based on query and key-value pairs, constructed using our lane-aware query generator and dynamic 3D ground positional embedding. On the one hand, each query is generated based on 2D lane-aware features and adopts a hybrid embedding to enhance lane information. On the other hand, 3D space information is injected as positional embedding from an iteratively updated 3D ground plane. LATR outperforms previous state-of-the-art methods on the synthetic Apollo dataset and on the realistic OpenLane and ONCE-3DLanes datasets by large margins (e.g., a gain of 11.4 in F1 score on OpenLane). Code will be released at this https URL.
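The cross-attention step mentioned in the abstract can be illustrated with a minimal sketch. This is generic single-head scaled dot-product attention, not the authors' implementation: the function name `cross_attention`, the toy vectors, the absence of learned projections, and the choice to add the positional embedding to the keys only are all assumptions made for illustration. The lane-aware query generator and the iterative ground-plane update are not modeled here.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def add(u, v):
    """Element-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(u, v)]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def cross_attention(queries, features, pos_embed):
    """Single-head cross-attention without learned projections (hypothetical helper).

    queries:   Nq vectors, stand-ins for lane queries
    features:  Nk vectors, stand-ins for flattened front-view image features
    pos_embed: Nk vectors, stand-ins for a per-location 3D positional embedding
    """
    dim = len(queries[0])
    # Assumption: position information enters through the keys only.
    keys = [add(f, p) for f, p in zip(features, pos_embed)]
    values = features
    outputs, weights = [], []
    for q in queries:
        scores = [dot(q, k) / math.sqrt(dim) for k in keys]
        w = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        out = [sum(w_i * v[d] for w_i, v in zip(w, values)) for d in range(dim)]
        outputs.append(out)
        weights.append(w)
    return outputs, weights

# Toy inputs, invented for this example.
queries = [[1.0, 0.0], [0.0, 1.0]]
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
pos_embed = [[0.1, 0.0], [0.0, 0.1], [0.0, 0.0]]
outputs, weights = cross_attention(queries, features, pos_embed)
```

Each query ends up with one attention weight per image location, and its output is the corresponding weighted average of the image features. In this toy setup the first query attends most strongly to the first location and the second query to the second.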

Comments:	Accepted by ICCV2023 (Oral)
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2308.04583 [cs.CV]
	(or arXiv:2308.04583v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2308.04583

Submission history

From: Yueru Luo
[v1] Tue, 8 Aug 2023 21:08:42 UTC (2,596 KB)
[v2] Sun, 20 Aug 2023 13:31:54 UTC (2,595 KB)
