| Main Authors: | Wang, Yihao, Miao, Yang, Zhao, Wenshuai, Yang, Wenyan, Wang, Zihan, Pajarinen, Joni, Van Gool, Luc, Paudel, Danda Pani, Kannala, Juho, Wang, Xi, Solin, Arno |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.25539 |
Similar Items
Smoothing Slot Attention Iterations and Recurrences
by: Zhao, Rongzhen, et al.
Published: (2025)
Articulate3D: Holistic Understanding of 3D Scenes as Universal Scene Description
by: Halacheva, Anna-Maria, et al.
Published: (2024)
Sparsely Supervised Diffusion
by: Zhao, Wenshuai, et al.
Published: (2026)
Optimistic Multi-Agent Policy Gradient
by: Zhao, Wenshuai, et al.
Published: (2023)
Latent-Compressed Variational Autoencoder for Video Diffusion Models
by: Guan, Jiarui, et al.
Published: (2026)
Lego: Learning to Disentangle and Invert Personalized Concepts Beyond Object Appearance in Text-to-Image Diffusion Models
by: Motamed, Saman, et al.
Published: (2023)
Autonomous Vehicle Controllers From End-to-End Differentiable Simulation
by: Nachkov, Asen, et al.
Published: (2024)
EvenNICER-SLAM: Event-based Neural Implicit Encoding SLAM
by: Chen, Shi, et al.
Published: (2024)
Efficient Reinforcement Learning by Guiding Generalist World Models with Non-Curated Data
by: Zhao, Yi, et al.
Published: (2025)
Multi-Scale Fusion for Object Representation
by: Zhao, Rongzhen, et al.
Published: (2024)
Grouped Discrete Representation for Object-Centric Learning
by: Zhao, Rongzhen, et al.
Published: (2024)
Vector-Quantized Vision Foundation Models for Object-Centric Learning
by: Zhao, Rongzhen, et al.
Published: (2025)
Organized Grouped Discrete Representation for Object-Centric Learning
by: Zhao, Rongzhen, et al.
Published: (2024)
Grouped Discrete Representation Guides Object-Centric Learning
by: Zhao, Rongzhen, et al.
Published: (2024)
LangHOPS: Language Grounded Hierarchical Open-Vocabulary Part Segmentation
by: Miao, Yang, et al.
Published: (2025)
DeSplat: Decomposed Gaussian Splatting for Distractor-Free Rendering
by: Wang, Yihao, et al.
Published: (2024)
Point Tracking Improves World Action Models
by: Guan, Jiarui, et al.
Published: (2026)
Exo2EgoSyn: Unlocking Foundation Video Generation Models for Exocentric-to-Egocentric Video Synthesis
by: Mahdi, Mohammad, et al.
Published: (2025)
Autonomous Vehicle Path Planning by Searching With Differentiable Simulation
by: Nachkov, Asen, et al.
Published: (2025)
Predicting Video Slot Attention Queries from Random Slot-Feature Pairs
by: Zhao, Rongzhen, et al.
Published: (2025)
Internalizing Temporal Consistency in Video Object-Centric Learning without Explicit Regularization
by: Zhao, Rongzhen, et al.
Published: (2026)
Slot Attention with Re-Initialization and Self-Distillation
by: Zhao, Rongzhen, et al.
Published: (2025)
Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes
by: Ma, Qi, et al.
Published: (2024)
Continuous Pose for Monocular Cameras in Neural Implicit Representation
by: Ma, Qi, et al.
Published: (2023)
From Synchrony to Sequence: Exo-to-Ego Generation via Interpolation
by: Mahdi, Mohammad, et al.
Published: (2026)
Vision encoders should be image size agnostic and task driven
by: Prisadnikov, Nedyalko, et al.
Published: (2025)
Self-supervised pretraining for an iterative image size agnostic vision transformer
by: Prisadnikov, Nedyalko, et al.
Published: (2026)
A Simple and Generalist Approach for Panoptic Segmentation
by: Prisadnikov, Nedyalko, et al.
Published: (2024)
Cross-View Multi-Modal Segmentation @ Ego-Exo4D Challenges 2025
by: Fu, Yuqian, et al.
Published: (2025)
Exploration-Driven Generative Interactive Environments
by: Savov, Nedko, et al.
Published: (2025)
GaussianVLM: Scene-centric 3D Vision-Language Models using Language-aligned Gaussian Splats for Embodied Reasoning and Beyond
by: Halacheva, Anna-Maria, et al.
Published: (2025)
StateSpaceDiffuser: Bringing Long Context to Diffusion World Models
by: Savov, Nedko, et al.
Published: (2025)
Reachability Weighted Offline Goal-conditioned Resampling
by: Yang, Wenyan, et al.
Published: (2025)
Sources of Uncertainty in 3D Scene Reconstruction
by: Klasson, Marcus, et al.
Published: (2024)
Inferring Compositional 4D Scenes without Ever Seeing One
by: Gokmen, Ahmet Berke, et al.
Published: (2025)
Taming CLIP for Fine-grained and Structured Visual Understanding of Museum Exhibits
by: Balauca, Ada-Astrid, et al.
Published: (2024)
RICO: Two Realistic Benchmarks and an In-Depth Analysis for Incremental Learning in Object Detection
by: Neuwirth-Trapp, Matthias, et al.
Published: (2025)
Incremental Object Detection with Prompt-based Methods
by: Neuwirth-Trapp, Matthias, et al.
Published: (2025)
Occam's LGS: An Efficient Approach for Language Gaussian Splatting
by: Cheng, Jiahuan, et al.
Published: (2024)
EgoSpot: Egocentric Multimodal Control for Hands-Free Mobile Manipulation
by: Zhang, Ganlin, et al.
Published: (2023)