| Main Authors: | Peng, Kunyu, Zhou, Zhikun, Yang, Kailun, Wen, Di, Liu, Ruiping, Chen, Yufan, Zheng, Junwei, Shi, Hao, Zhou, Yi, Sarfraz, M. Saquib, Paudel, Danda Pani, Van Gool, Luc |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.18431 |
Similar Items
EReLiFM: Evidential Reliability-Aware Residual Flow Meta-Learning for Open-Set Domain Generalization under Noisy Labels
by: Peng, Kunyu, et al.
Published: (2025)
RoHOI: Robustness Benchmark for Human-Object Interaction Detection
by: Wen, Di, et al.
Published: (2025)
RefAtomNet++: Advancing Referring Atomic Video Action Recognition using Semantic Retrieval based Multi-Trajectory Mamba
by: Peng, Kunyu, et al.
Published: (2025)
Are Multimodal Large Language Models Ready for Omnidirectional Spatial Reasoning?
by: Dongfang, Zihao, et al.
Published: (2025)
InterEdit: Navigating Text-Guided Multi-Human 3D Motion Editing
by: Yang, Yebin, et al.
Published: (2026)
ProOOD: Prototype-Guided Out-of-Distribution 3D Occupancy Prediction
by: Zhang, Yuheng, et al.
Published: (2026)
Inferring Compositional 4D Scenes without Ever Seeing One
by: Gokmen, Ahmet Berke, et al.
Published: (2025)
Lego: Learning to Disentangle and Invert Personalized Concepts Beyond Object Appearance in Text-to-Image Diffusion Models
by: Motamed, Saman, et al.
Published: (2023)
Autonomous Vehicle Controllers From End-to-End Differentiable Simulation
by: Nachkov, Asen, et al.
Published: (2024)
EvenNICER-SLAM: Event-based Neural Implicit Encoding SLAM
by: Chen, Shi, et al.
Published: (2024)
EgoSpot: Egocentric Multimodal Control for Hands-Free Mobile Manipulation
by: Zhang, Ganlin, et al.
Published: (2023)
Fourier Prompt Tuning for Modality-Incomplete Scene Segmentation
by: Liu, Ruiping, et al.
Published: (2024)
Exo2EgoSyn: Unlocking Foundation Video Generation Models for Exocentric-to-Egocentric Video Synthesis
by: Mahdi, Mohammad, et al.
Published: (2025)
EgoExoMem: Cross-View Memory Reasoning over Synchronized Egocentric and Exocentric Videos
by: Liu, Ruiping, et al.
Published: (2026)
Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes
by: Ma, Qi, et al.
Published: (2024)
Continuous Pose for Monocular Cameras in Neural Implicit Representation
by: Ma, Qi, et al.
Published: (2023)
From Synchrony to Sequence: Exo-to-Ego Generation via Interpolation
by: Mahdi, Mohammad, et al.
Published: (2026)
Vision encoders should be image size agnostic and task driven
by: Prisadnikov, Nedyalko, et al.
Published: (2025)
Self-supervised pretraining for an iterative image size agnostic vision transformer
by: Prisadnikov, Nedyalko, et al.
Published: (2026)
Referring Atomic Video Action Recognition
by: Peng, Kunyu, et al.
Published: (2024)
A Simple and Generalist Approach for Panoptic Segmentation
by: Prisadnikov, Nedyalko, et al.
Published: (2024)
Generalist Robot Manipulation beyond Action Labeled Data
by: Spiridonov, Alexander, et al.
Published: (2025)
ReVLA: Reverting Visual Domain Limitation of Robotic Foundation Models
by: Dey, Sombit, et al.
Published: (2024)
EgoCross: Benchmarking Multimodal Large Language Models for Cross-Domain Egocentric Video Question Answering
by: Li, Yanjun, et al.
Published: (2025)
Taming CLIP for Fine-grained and Structured Visual Understanding of Museum Exhibits
by: Balauca, Ada-Astrid, et al.
Published: (2024)
RICO: Two Realistic Benchmarks and an In-Depth Analysis for Incremental Learning in Object Detection
by: Neuwirth-Trapp, Matthias, et al.
Published: (2025)
Incremental Object Detection with Prompt-based Methods
by: Neuwirth-Trapp, Matthias, et al.
Published: (2025)
Occam's LGS: An Efficient Approach for Language Gaussian Splatting
by: Cheng, Jiahuan, et al.
Published: (2024)
SeasonScapes: Learning Large-scale Re-lightable 3D Landscapes with Seasonal Variation from Sparse Webcams
by: Kleger, Timo, et al.
Published: (2026)
Ternary-Type Opacity and Hybrid Odometry for RGB NeRF-SLAM
by: Lin, Junru, et al.
Published: (2023)
Cross-View Multi-Modal Segmentation @ Ego-Exo4D Challenges 2025
by: Fu, Yuqian, et al.
Published: (2025)
GaussianVLM: Scene-centric 3D Vision-Language Models using Language-aligned Gaussian Splats for Embodied Reasoning and Beyond
by: Halacheva, Anna-Maria, et al.
Published: (2025)
Mitigating Label Noise using Prompt-Based Hyperbolic Meta-Learning in Open-Set Domain Generalization
by: Peng, Kunyu, et al.
Published: (2024)
Autonomous Vehicle Path Planning by Searching With Differentiable Simulation
by: Nachkov, Asen, et al.
Published: (2025)
Unlocking Efficient Vehicle Dynamics Modeling via Analytic World Models
by: Nachkov, Asen, et al.
Published: (2025)
FireScope: Wildfire Risk Raster Prediction with a Chain-of-Thought Oracle
by: Markov, Mario, et al.
Published: (2025)
B-GRTO: Bootstrapped Group Relative Tool Optimization for Referring Segmentation
by: Markov, Mario, et al.
Published: (2026)
DriveXQA: Cross-modal Visual Question Answering for Adverse Driving Scene Understanding
by: Tao, Mingzhe, et al.
Published: (2026)
$M^2$-Occ: Resilient 3D Semantic Occupancy Prediction for Autonomous Driving with Incomplete Camera Inputs
by: Lin, Kaixin, et al.
Published: (2026)
Advancing Open-Set Domain Generalization Using Evidential Bi-Level Hardest Domain Scheduler
by: Peng, Kunyu, et al.
Published: (2024)