Self-supervised Monocular Depth Estimation: Let’s Talk About the Weather

Abstract

Current self-supervised depth estimation architectures rely on clear, sunny weather scenes to train deep neural networks. However, in many locations this assumption is too strong: in the UK, for example, it rained on 149 days in 2021. For these architectures to be effective in real-world applications, we must create models that can generalise to all weather conditions, times of day and image qualities. Using a combination of computer graphics and generative models, one can augment existing sunny-weather data in a variety of ways that simulate adverse weather effects. While it is tempting to use such data augmentations for self-supervised depth, this has previously been shown to degrade performance rather than improve it. In this paper, we put forward a method that uses augmentations to remedy this problem. By exploiting the correspondence between unaugmented and augmented data, we introduce a pseudo-supervised loss for both depth and pose estimation, recovering some of the benefits of supervised learning while still requiring no labels. We also make a series of practical recommendations which collectively offer a reliable, efficient framework for weather-related augmentation of self-supervised depth from monocular video. We present extensive testing to show that our method, Robust-Depth, achieves state-of-the-art (SotA) performance on the KITTI dataset while significantly surpassing SotA on challenging adverse-condition data such as DrivingStereo, Foggy CityScapes and NuScenes-Night.
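The core idea described above, using predictions on the unaugmented image as pseudo labels for the augmented image, can be pictured with a short PyTorch sketch. This is a minimal illustration under assumptions, not the paper's exact formulation: `depth_net`, `pose_net` and the frame arguments are hypothetical stand-ins, and in practice such a term would be combined with the usual photometric self-supervision.

```python
import torch.nn.functional as F

def pseudo_supervised_loss(depth_net, pose_net,
                           img_clean, img_aug,
                           img_clean_next, img_aug_next):
    """Sketch of a pseudo-supervised consistency loss (hypothetical).

    Predictions on the clean (unaugmented) frames are detached so they
    act as fixed pseudo ground truth for the augmented branch; gradients
    only flow through the adverse-weather predictions.
    """
    # Depth pseudo-supervision: clean prediction as target.
    depth_clean = depth_net(img_clean).detach()
    depth_aug = depth_net(img_aug)
    loss_depth = F.l1_loss(depth_aug, depth_clean)

    # Pose pseudo-supervision: same idea for the relative pose
    # predicted from consecutive frames.
    pose_clean = pose_net(img_clean, img_clean_next).detach()
    pose_aug = pose_net(img_aug, img_aug_next)
    loss_pose = F.l1_loss(pose_aug, pose_clean)

    return loss_depth + loss_pose
```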

Divisions: College of Engineering & Physical Sciences > School of Computer Science and Digital Technologies > Applied AI & Robotics
College of Engineering & Physical Sciences > School of Computer Science and Digital Technologies
College of Engineering & Physical Sciences > Aston Centre for Artificial Intelligence Research and Application
Aston University (General)
Additional Information: This ICCV paper is the Open Access version, provided by the Computer Vision Foundation. Except for this watermark, it is identical to the accepted version; the final published version of the proceedings is available on IEEE Xplore. Funding & Acknowledgements: This research was funded and supported by the EPSRC’s DTP, Grant EP/W524566/1. Most experiments were run on Aston EPS Machine Learning Server, funded by the EPSRC Core Equipment Fund, Grant EP/V036106/1.
Event Title: The 2023 International Conference on Computer Vision
Event Type: Other
Event Dates: 2023-10-02 - 2023-10-06
Last Modified: 12 Dec 2024 08:40
Date Deposited: 26 Oct 2023 10:29
Full Text Link: https://arxiv.org/abs/2307.08357
https://openacc ... 2023_paper.html
PURE Output Type: Conference contribution
Published Date: 2023-10-06
Published Online Date: 2023-10-06
Accepted Date: 2023-05-31
Authors: Saunders, Kieran
Vogiatzis, George
Manso, Luis J. (ORCID Profile 0000-0003-2616-1120)

Version: Accepted Version