| Main Authors: | Zhao, Rongzhen, Yang, Wenyan, Kannala, Juho, Pajarinen, Joni |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2508.05417 |
Similar Items
Predicting Video Slot Attention Queries from Random Slot-Feature Pairs
by: Zhao, Rongzhen, et al.
Published: (2025)
Slot Attention with Re-Initialization and Self-Distillation
by: Zhao, Rongzhen, et al.
Published: (2025)
Vector-Quantized Vision Foundation Models for Object-Centric Learning
by: Zhao, Rongzhen, et al.
Published: (2025)
Multi-Scale Fusion for Object Representation
by: Zhao, Rongzhen, et al.
Published: (2024)
Internalizing Temporal Consistency in Video Object-Centric Learning without Explicit Regularization
by: Zhao, Rongzhen, et al.
Published: (2026)
Organized Grouped Discrete Representation for Object-Centric Learning
by: Zhao, Rongzhen, et al.
Published: (2024)
Grouped Discrete Representation for Object-Centric Learning
by: Zhao, Rongzhen, et al.
Published: (2024)
Grouped Discrete Representation Guides Object-Centric Learning
by: Zhao, Rongzhen, et al.
Published: (2024)
Cycle Consistency in Video Object-Centric Learning
by: Zhao, Rongzhen, et al.
Published: (2026)
MetaSlot: Break Through the Fixed Number of Slots in Object-Centric Learning
by: Liu, Hongjia, et al.
Published: (2025)
Rethinking Temporal Consistency in Video Object-Centric Learning: From Prediction to Correspondence
by: Li, Zhiyuan, et al.
Published: (2026)
Object-Centric Vision Token Pruning for Vision Language Models
by: Li, Guangyuan, et al.
Published: (2025)
PAWS: Perception of Articulation in the Wild at Scale from Egocentric Videos
by: Wang, Yihao, et al.
Published: (2026)
A2-GNN: Angle-Annular GNN for Visual Descriptor-free Camera Relocalization
by: Zhang, Yejun, et al.
Published: (2025)
DGC-GNN: Leveraging Geometry and Color Cues for Visual Descriptor-Free 2D-3D Matching
by: Wang, Shuzhe, et al.
Published: (2023)
Scene-Agnostic Object-Centric Representation Learning for 3D Gaussian Splatting
by: Hsu, Tsuheng, et al.
Published: (2026)
Efficient NeRF Optimization -- Not All Samples Remain Equally Hard
by: Korhonen, Juuso, et al.
Published: (2024)
RGB-Th-Bench: A Dense benchmark for Visual-Thermal Understanding of Vision Language Models
by: Moshtaghi, Mehdi, et al.
Published: (2025)
Sources of Uncertainty in 3D Scene Reconstruction
by: Klasson, Marcus, et al.
Published: (2024)
Latent-Compressed Variational Autoencoder for Video Diffusion Models
by: Guan, Jiarui, et al.
Published: (2026)
OP2GS: Object-Aware 3D Gaussian Splatting with Dual-Opacity Primitives
by: Liu, Guiyu, et al.
Published: (2026)
Medical Image Segmentation with SAM-generated Annotations
by: Häkkinen, Iira, et al.
Published: (2024)
Gaussian Splatting in Mirrors: Reflection-Aware Rendering via Virtual Camera Optimization
by: Wang, Zihan, et al.
Published: (2024)
Differentiable Product Quantization for Memory Efficient Camera Relocalization
by: Laskar, Zakaria, et al.
Published: (2024)
Slot-VAE: Object-Centric Scene Generation with Slot Attention
by: Wang, Yanbo, et al.
Published: (2023)
3D Gaussian Splatting with Fisheye Images: Field of View Analysis and Depth-Based Initialization
by: Gunes, Ulas, et al.
Published: (2025)
FingerVeinSyn-5M: A Million-Scale Dataset and Benchmark for Finger Vein Recognition
by: Wang, Yinfan, et al.
Published: (2025)
NVSMask3D: Hard Visual Prompting with Camera Pose Interpolation for 3D Open Vocabulary Instance Segmentation
by: Fang, Junyuan, et al.
Published: (2025)
FIORD: A Fisheye Indoor-Outdoor Dataset with LIDAR Ground Truth for 3D Scene Reconstruction and Benchmarking
by: Gunes, Ulas, et al.
Published: (2025)
DN-Splatter: Depth and Normal Priors for Gaussian Splatting and Meshing
by: Turkulainen, Matias, et al.
Published: (2024)
HybVIO: Pushing the Limits of Real-time Visual-inertial Odometry
by: Seiskari, Otto, et al.
Published: (2021)
MuSHRoom: Multi-Sensor Hybrid Room Dataset for Joint 3D Reconstruction and Novel View Synthesis
by: Ren, Xuqian, et al.
Published: (2023)
Reloc3r: Large-Scale Training of Relative Camera Pose Regression for Generalizable, Fast, and Accurate Visual Localization
by: Dong, Siyan, et al.
Published: (2024)
Attention Normalization Impacts Cardinality Generalization in Slot Attention
by: Krimmel, Markus, et al.
Published: (2024)
Adaptive Slot Attention: Object Discovery with Dynamic Slot Number
by: Fan, Ke, et al.
Published: (2024)
PRISM: Progressive Reasoning through Iterative Slot Memory for Vision
by: Wang, Ziyu, et al.
Published: (2026)
AGS-Mesh: Adaptive Gaussian Splatting and Meshing with Geometric Priors for Indoor Room Reconstruction Using Smartphones
by: Ren, Xuqian, et al.
Published: (2024)
DeSplat: Decomposed Gaussian Splatting for Distractor-Free Rendering
by: Wang, Yihao, et al.
Published: (2024)
Guided Slot Attention for Unsupervised Video Object Segmentation
by: Lee, Minhyeok, et al.
Published: (2023)
MUFASA: A Multi-Layer Framework for Slot Attention
by: Bock, Sebastian, et al.
Published: (2026)