TY - GEN
T1 - Enhancing Synthetic Image Realism with Controlled Diffusion Models
AU - Nosheen, Iqra
AU - Farooq, Muhammad Ali
AU - Corcoran, Peter
AU - Ennis, Cathy
AU - Madden, Michael G.
N1 - Publisher Copyright:
© 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - In this work, we present an innovative approach utilizing ControlNet-based diffusion models along with upscaling capabilities for domain adaptation and quality refinement of 3D modelled synthetic datasets, focusing on autonomous vehicle applications. A significant domain gap often exists between synthetic and real-world data, hindering the applicability of deep learning models trained on synthetic data for real-world scenarios. Our methodology leverages the strengths of Controlled Augmentation by simultaneously utilizing multiple ControlNet signals, including edge detection, depth information, segmentation maps, and tile resampling. These signals guide the generative process to improve how synthetic data aligns with the desired domain specifications, and we also incorporate text-guided prompts extracted via Large Language Models (LLMs) to improve control over the synthesis of desired features and attributes. We test the approach on diverse environmental conditions from the VKITTI dataset, a well-known 3D modelled synthetic dataset generated in Unity for autonomous driving research. The refined data is validated using quantitative metrics including FID, SSIM, and LPIPS, and is also evaluated on downstream machine learning tasks of object detection and classification using YOLO-v8, to ensure its utility and effectiveness. Experimental analysis demonstrates the effectiveness of this method in improving the realism and usability of synthetic data. Our approach contributes to fields that require high-quality data synthesis and domain adaptation. The experimental work, along with the ControlNet models used in this project, is available online.
AB - In this work, we present an innovative approach utilizing ControlNet-based diffusion models along with upscaling capabilities for domain adaptation and quality refinement of 3D modelled synthetic datasets, focusing on autonomous vehicle applications. A significant domain gap often exists between synthetic and real-world data, hindering the applicability of deep learning models trained on synthetic data for real-world scenarios. Our methodology leverages the strengths of Controlled Augmentation by simultaneously utilizing multiple ControlNet signals, including edge detection, depth information, segmentation maps, and tile resampling. These signals guide the generative process to improve how synthetic data aligns with the desired domain specifications, and we also incorporate text-guided prompts extracted via Large Language Models (LLMs) to improve control over the synthesis of desired features and attributes. We test the approach on diverse environmental conditions from the VKITTI dataset, a well-known 3D modelled synthetic dataset generated in Unity for autonomous driving research. The refined data is validated using quantitative metrics including FID, SSIM, and LPIPS, and is also evaluated on downstream machine learning tasks of object detection and classification using YOLO-v8, to ensure its utility and effectiveness. Experimental analysis demonstrates the effectiveness of this method in improving the realism and usability of synthetic data. Our approach contributes to fields that require high-quality data synthesis and domain adaptation. The experimental work, along with the ControlNet models used in this project, is available online.
KW - autonomous vehicles
KW - Diffusion models
KW - domain adaptation
KW - image realism
KW - synthetic data
UR - https://www.scopus.com/pages/publications/105023982254
U2 - 10.1109/IJCNN64981.2025.11228158
DO - 10.1109/IJCNN64981.2025.11228158
M3 - Conference contribution
AN - SCOPUS:105023982254
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - International Joint Conference on Neural Networks, IJCNN 2025 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 International Joint Conference on Neural Networks, IJCNN 2025
Y2 - 30 June 2025 through 5 July 2025
ER -