Investigating the Impact of Encoder Architectures and Batch Size on Depth Estimation through Semantic Consistency

  • Iqra Nosheen
  • Talha Iqbal
  • Ihsan Ullah
  • Cathy Ennis
  • Michael G. Madden

Research output: Contribution to journal › Conference article › peer-review

Abstract

Traditional methods for depth estimation rely on supervised learning with resource-intensive LiDAR data. Virtual synthetic datasets provide a cost-effective alternative, but bridging the domain gap between synthetic and real-world data remains a significant challenge. Existing work addresses this gap through domain adaptation techniques that align the feature distributions of the synthetic (source) and real-world (target) domains. Our study explores the efficacy of different encoder architectures (ResNet variants with 35, 50, 101, 101-with-attention, and 152 convolution layers) and two batch sizes (2 and 4) for the depth estimation task. Our experiments show that ResNet101 without attention performs best at batch size 2, and ResNet101 with attention performs best at batch size 4, compared to the other models. Conversely, the deepest architecture considered, ResNet152, shows the lowest performance, indicating that increasing network depth does not necessarily improve results for depth estimation tasks. The study's findings provide valuable insights for developing more effective depth estimation algorithms and suggest future directions in hyperparameter optimization and semantic consistency modeling.
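The experimental design described in the abstract (five encoder variants crossed with two batch sizes) can be sketched as a simple configuration grid. This is only an illustration of the comparison setup: the encoder labels and the `experiment_grid` helper are hypothetical names, not identifiers from the authors' code, and no training or evaluation logic is implied.

```python
from itertools import product

# Encoder variants and batch sizes as listed in the abstract.
# The string labels are illustrative assumptions, not the authors' identifiers.
ENCODERS = ["resnet35", "resnet50", "resnet101", "resnet101_attention", "resnet152"]
BATCH_SIZES = [2, 4]


def experiment_grid(encoders=ENCODERS, batch_sizes=BATCH_SIZES):
    """Return one configuration dict per (encoder, batch size) pair."""
    return [
        {"encoder": encoder, "batch_size": batch_size}
        for encoder, batch_size in product(encoders, batch_sizes)
    ]


if __name__ == "__main__":
    for config in experiment_grid():
        print(config)
```

Enumerating the grid this way gives ten runs in total, one per encoder and batch size combination, which matches the comparison reported above.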

Original language: English
Pages (from-to): 134-137
Number of pages: 4
Journal: IET Conference Proceedings
Volume: 2024
Issue number: 10
Publication status: Published - 2024
Event: 26th Irish Machine Vision and Image Processing Conference, IMVIP 2024 - Limerick, Ireland
Duration: 21 Aug 2024 → 23 Aug 2024

Keywords

  • Batch sizes
  • Depth Estimation
  • Encoder Architectures
  • Image translation
  • Semantic Consistency
