Dual-Domain Fusion and Dynamic Residual Learning for Multi-Scale High-Resolution Semantic Segmentation of Urban Scenes
Affiliation:

Xi'an University of Posts and Telecommunications

Fund Project:

This work was supported by the Key Research and Development Program of Shaanxi Province (No. 2025GH-YBXM-021).

    Abstract:

    To address insufficient detail depiction, blurred boundaries, and mis-segmentation in UAV-based urban scene segmentation, we propose a semantic segmentation framework, ETP-HRNet. The framework incorporates an Edge-guided Dual-domain Feature Fusion Structure (EDFS), which integrates spatial and frequency information to enhance the perception of edge detail and thereby improve segmentation accuracy in boundary regions. A Tri-Dimensional Dynamic Residual Module (TDRM) expands the receptive field and enriches semantic representations through multidimensional convolution and a dynamic selection mechanism, mitigating intra-class feature inconsistencies. In addition, a Pyramid Feature Aggregation Structure (PFAS) facilitates efficient cross-scale feature integration, enhancing the model's ability to capture multi-scale contextual information. Experimental results show that ETP-HRNet achieves an mIoU of 70.04% on the UAVid dataset and 74.61% on the UDD dataset. Notably, it improves the static car category by 8.03% on the UAVid dataset and the road category by 5.39% on the UDD dataset, demonstrating its effectiveness in fine-grained segmentation and semantic consistency.
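
The results above are reported as mIoU (mean Intersection over Union). As a reading aid, the following is a minimal sketch of how that metric is conventionally computed from predicted and ground-truth labels. The function name `mean_iou`, the choice to skip classes absent from both prediction and ground truth, and the toy labels are assumptions for illustration; the abstract does not describe the authors' evaluation code, which may differ (for example, by accumulating counts over a whole dataset).

```python
# Illustrative sketch of mean Intersection over Union (mIoU).
# Names and toy data are hypothetical, not taken from the paper.

def mean_iou(pred, target, num_classes):
    """Return (mIoU, per-class IoU list) for two equal-length label sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where both labels equal class c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labelled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        # Assumption: a class absent from both inputs is excluded from the mean.
        ious.append(intersection[c] / union if union > 0 else None)

    valid = [v for v in ious if v is not None]
    miou = sum(valid) / len(valid) if valid else 0.0
    return miou, ious


# Toy example: 3 hypothetical classes over 6 pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
miou, per_class = mean_iou(pred, target, num_classes=3)
```

In this toy case the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5. A per-category figure such as the one quoted for the static car class corresponds to a single entry of the per-class list rather than to the mean.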

History
  • Received: July 14, 2025
  • Revised: September 11, 2025
  • Adopted: October 22, 2025