Bibliographic Details
Main Author: Wang, Tinghuai
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2407.05913
author Wang, Tinghuai
author_facet Wang, Tinghuai
contents Learning a data-driven spatio-temporal semantic representation of objects is the key to coherent and consistent labelling in video. This paper proposes to achieve semantic video object segmentation by learning a data-driven representation that captures the synergy of multiple instances across consecutive frames. To prune noisy detections, we exploit the rich information among multiple instances and select a discriminative and representative subset. This selection process is formulated as a facility location problem, solved by maximising a submodular function. Our method retrieves the longer-term contextual dependencies that underpin a robust semantic video object segmentation algorithm. We present extensive experiments on a challenging dataset that demonstrate the superior performance of our approach compared with state-of-the-art methods.
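The abstract describes proposal selection as a facility location problem solved by maximising a submodular function. The sketch below illustrates that general idea only. It is not the paper's implementation: the pairwise similarity matrix, the function names, the toy data, and the choice of a plain greedy solver are all assumptions made for illustration. With non-negative similarities the facility location objective is monotone submodular, so greedy selection under a cardinality budget carries the standard (1 - 1/e) approximation guarantee.

```python
# Hypothetical sketch: greedy facility-location selection over object proposals.
# `similarity[i][j]` is an assumed non-negative similarity between proposals i and j;
# the record does not say how the paper actually measures similarity.

def facility_location_value(similarity, selected):
    """F(S) = sum over all items i of max_{j in S} similarity[i][j]; F(empty) = 0."""
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in similarity)


def greedy_select(similarity, budget):
    """Pick up to `budget` representatives by repeatedly adding the item
    with the largest marginal gain in the facility-location objective."""
    n = len(similarity)
    selected = []
    coverage = [0.0] * n  # best similarity of each item to the current selection
    for _ in range(min(budget, n)):
        best, best_gain = None, 0.0
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain: how much candidate j improves each item's coverage.
            gain = sum(max(similarity[i][j] - coverage[i], 0.0) for i in range(n))
            if gain > best_gain:
                best, best_gain = j, gain
        if best is None:  # no candidate improves the objective any further
            break
        selected.append(best)
        coverage = [max(coverage[i], similarity[i][best]) for i in range(n)]
    return selected


# Toy data (made up): proposals 0-1 form one cluster, proposals 2-4 another.
toy_similarity = [
    [1.0, 0.8, 0.1, 0.1, 0.0],
    [0.8, 1.0, 0.1, 0.1, 0.0],
    [0.1, 0.1, 1.0, 0.9, 0.6],
    [0.1, 0.1, 0.9, 1.0, 0.9],
    [0.0, 0.0, 0.6, 0.9, 1.0],
]
chosen = greedy_select(toy_similarity, budget=2)
```

On this toy matrix the first pick is the proposal most central to the larger cluster, and the second pick comes from the other cluster, which is the "representative subset" behaviour the abstract alludes to.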
format Preprint
id arxiv_https___arxiv_org_abs_2407_05913
institution arXiv
publishDate 2024
record_format arxiv
title Submodular video object proposal selection for semantic object segmentation
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2407.05913