Point-voxel dual transformer for LiDAR 3D object detection
Author: TONG Jigang, YANG Fanhang, YANG Sen, DU Shengzhi
Affiliation:

1. Tianjin Key Laboratory for Control Theory & Applications in Complicated Systems and Intelligent Robot Laboratory, Tianjin University of Technology, Tianjin 300384, China
2. Department of Electrical Engineering, Tshwane University of Technology, Pretoria 0001, South Africa

    Abstract:

    This paper presents the point-voxel dual transformer (PV-DT3D), a two-stage, transformer-based framework for light detection and ranging (LiDAR) three-dimensional (3D) object detection. PV-DT3D uses point-voxel fusion features for proposal refinement. Specifically, keypoints are sampled from the entire point cloud scene and used to encode representative scene features via a proposal-aware voxel set abstraction module. After the region proposal network (RPN) generates proposals, the encoded keypoints inside each proposal are fed into a dual transformer encoder-decoder architecture. PV-DT3D combines a point-wise transformer with a channel-wise architecture to capture contextual information along both the spatial and channel dimensions. Experiments on the highly competitive KITTI 3D car detection leaderboard show that PV-DT3D achieves superior detection accuracy among state-of-the-art point-voxel-based methods.
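
The distinction between point-wise and channel-wise context in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the identity query/key/value projections, and the concatenation used for fusion are all assumptions made for illustration, and the abstract does not specify how PV-DT3D combines the two branches.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def pointwise_attention(feats):
    """Self-attention across keypoints (spatial dimension).

    feats: (N, C) array of N keypoint features with C channels.
    Assumption: queries, keys and values are the raw features
    (no learned projections), to keep the sketch short.
    """
    _, c = feats.shape
    scores = feats @ feats.T / np.sqrt(c)   # (N, N) point-to-point affinity
    return softmax(scores, axis=-1) @ feats  # (N, C)


def channelwise_attention(feats):
    """Self-attention across channels (channel dimension).

    Each channel is treated as a token of length N, so the
    affinity matrix is (C, C) instead of (N, N).
    """
    n, _ = feats.shape
    scores = feats.T @ feats / np.sqrt(n)    # (C, C) channel-to-channel affinity
    weights = softmax(scores, axis=-1)
    return feats @ weights.T                 # (N, C)


def dual_context(feats):
    """Hypothetical fusion: concatenate both context features."""
    return np.concatenate(
        [pointwise_attention(feats), channelwise_attention(feats)], axis=1
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    keypoints = rng.normal(size=(8, 4))  # 8 keypoints, 4 channels (toy sizes)
    print(dual_context(keypoints).shape)
```

In this sketch the point-wise branch reweights keypoints against each other, while the channel-wise branch reweights feature channels against each other; a real model would add learned projections, multiple heads, and a decoder.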

Get Citation

TONG Jigang, YANG Fanhang, YANG Sen, DU Shengzhi. Point-voxel dual transformer for LiDAR 3D object detection[J]. Optoelectronics Letters, 2025, (9): 547-554

History
  • Received: July 17, 2023
  • Revised: March 02, 2025
  • Online: August 21, 2025