Analysing Foreground Segmentation in Deep Learning Based Depth Estimation on Free-Viewpoint Video Systems

Volumetric video acquisition systems enable realistic virtual experiences such as Free-Viewpoint Video (FVV). Stereo matching is a well-known way of obtaining this volumetric information as depth images by computing the disparity between two stereo color images. In these applications, the captured background is static, so foreground information is far more valuable. We propose adding foreground segmentation to help learning-based algorithms, such as deep learning models, improve on previously obtained results. We used the Detectron2 framework to model foreground segmentation by detecting people. Additionally, we built a large stereo dataset focused on FVV systems. Finally, we modified a successful state-of-the-art deep learning model, CREStereo, to add foreground segmentation and trained it with supervision to estimate disparity, obtaining promising results.
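To make the relation between disparity, depth, and a foreground mask concrete, the following is a minimal sketch and not the authors' pipeline. It assumes a rectified stereo pair, for which depth follows the standard relation Z = f * B / d, and it scores disparity error only on foreground pixels. The function names, the hand-written mask (which in practice might come from a person detector such as Detectron2), and the choice of a foreground-only mean absolute error are all illustrative assumptions; the abstract does not state which metric or mask integration was used.

```python
from typing import List, Optional

Image = List[List[float]]
Mask = List[List[bool]]


def disparity_to_depth(
    disparity: Image, focal_px: float, baseline_m: float
) -> List[List[Optional[float]]]:
    """Convert disparity (pixels) to depth (metres) via Z = f * B / d.

    Assumes a rectified stereo pair. Pixels with non-positive disparity
    have no valid depth and are returned as None.
    """
    return [
        [focal_px * baseline_m / d if d > 0 else None for d in row]
        for row in disparity
    ]


def masked_epe(pred: Image, target: Image, mask: Mask) -> float:
    """Mean absolute disparity error over pixels where mask is True.

    Hypothetical helper: restricts the error to foreground pixels so that
    a static background does not dominate the score.
    """
    errors = [
        abs(p - t)
        for pred_row, target_row, mask_row in zip(pred, target, mask)
        for p, t, m in zip(pred_row, target_row, mask_row)
        if m
    ]
    if not errors:
        raise ValueError("mask selects no pixels")
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Toy 2x2 example with made-up values (not data from the paper).
    predicted = [[10.0, 20.0], [30.0, 40.0]]
    ground_truth = [[12.0, 20.0], [30.0, 44.0]]
    foreground = [[True, False], [False, True]]  # hand-written mask

    print(masked_epe(predicted, ground_truth, foreground))
    print(disparity_to_depth(predicted, focal_px=1000.0, baseline_m=0.5))
```

Restricting the error to masked pixels is one simple way to emphasise the foreground; the same mask could also be supplied to a network as an extra input or used to weight a training loss, and the abstract leaves that design choice unspecified.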
