Staff View: Loci-Segmented: Improving Scene Segmentation Learning

Bibliographic Details
Main Authors:	Traub, Manuel; Becker, Frederic; Sauter, Adrian; Otte, Sebastian; Butz, Martin V.
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2310.10410

_version_	1866913224188755968
author	Traub, Manuel; Becker, Frederic; Sauter, Adrian; Otte, Sebastian; Butz, Martin V.
author_facet	Traub, Manuel; Becker, Frederic; Sauter, Adrian; Otte, Sebastian; Butz, Martin V.
contents	Current slot-oriented approaches for compositional scene segmentation from images and videos rely on provided background information or slot assignments. We present a segmented location and identity tracking system, Loci-Segmented (Loci-s), which requires neither. It learns to dynamically segment scenes into interpretable background and slot-based object encodings, separating RGB, mask, location, and depth information for each. The results reveal largely superior video decomposition performance on the MOVi datasets and on another established dataset collection targeting scene segmentation. The system's interpretable, compositional latent encodings may serve as a foundation model for downstream tasks.
format	Preprint
id	arxiv_https___arxiv_org_abs_2310_10410
institution	arXiv
publishDate	2023
record_format	arxiv
title	Loci-Segmented: Improving Scene Segmentation Learning
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2310.10410

Similar Items