:: Library Catalog

Bibliographic Details
Main Authors:	An, Zhaochong; Kupyn, Orest; Uscidda, Théo; Colaco, Andrea; Ahuja, Karan; Belongie, Serge; Gonzalez-Franco, Mar; Gazulla, Marta Tintore
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2603.26599

Similar Items

Dataset Enhancement with Instance-Level Augmentations
by: Kupyn, Orest, et al.
Published: (2024)

S3OD: Towards Generalizable Salient Object Detection with Synthetic Data
by: Kupyn, Orest, et al.
Published: (2025)

VGGHeads: 3D Multi Head Alignment with a Large-Scale Synthetic Dataset
by: Kupyn, Orest, et al.
Published: (2024)

Epipolar Geometry Improves Video Generation Models
by: Kupyn, Orest, et al.
Published: (2025)

SurfaceXR: Fusing Smartwatch IMUs and Egocentric Hand Pose for Seamless Surface Interactions
by: Xu, Vasco, et al.
Published: (2026)

Geometry Fidelity for Spherical Images
by: Christensen, Anders, et al.
Published: (2024)

Augmented Object Intelligence with XR-Objects
by: Dogan, Mustafa Doga, et al.
Published: (2024)

MOBIUS: Big-to-Mobile Universal Instance Segmentation via Multi-modal Bottleneck Fusion and Calibrated Decoder Pruning
by: Segu, Mattia, et al.
Published: (2025)

Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model
by: An, Zhaochong, et al.
Published: (2025)

PARSE-Ego4D: Personal Action Recommendation Suggestions for Egocentric Videos
by: Abreu, Steven, et al.
Published: (2024)

DAD-3DHeads: A Large-scale Dense, Accurate and Diverse Dataset for 3D Head Alignment from a Single Image
by: Martyniuk, Tetiana, et al.
Published: (2022)

Rethinking Few-shot 3D Point Cloud Semantic Segmentation
by: An, Zhaochong, et al.
Published: (2024)

Multimodality Helps Few-shot 3D Point Cloud Semantic Segmentation
by: An, Zhaochong, et al.
Published: (2024)

Shifting the Breaking Point of Flow Matching for Multi-Instance Editing
by: Zaccagnino, Carmine, et al.
Published: (2026)

GeOT: A spatially explicit framework for evaluating spatio-temporal predictions
by: Wiedemann, Nina, et al.
Published: (2024)

ChatMotion: A Multimodal Multi-Agent for Human Motion Analysis
by: Li, Lei, et al.
Published: (2025)

Revisiting the Perception-Distortion Trade-off with Spatial-Semantic Guided Super-Resolution
by: Wang, Dan, et al.
Published: (2026)

PoseDreamer: Scalable and Photorealistic Human Data Generation Pipeline with Diffusion Models
by: Prospero, Lorenza, et al.
Published: (2026)

Symbiotic AI: Augmenting Human Cognition from PCs to Cars
by: Bovo, Riccardo, et al.
Published: (2025)

Diffuse, Attend, and Segment: Unsupervised Zero-Shot Segmentation using Stable Diffusion
by: Tian, Junjiao, et al.
Published: (2023)

Practical and Rich User Digitization
by: Ahuja, Karan
Published: (2024)

EmBARDiment: an Embodied AI Agent for Productivity in XR
by: Bovo, Riccardo, et al.
Published: (2024)

The Latent Color Subspace: Emergent Order in High-Dimensional Chaos
by: Pach, Mateusz, et al.
Published: (2026)

Generalized Discrete Diffusion from Snapshots
by: Zekri, Oussama, et al.
Published: (2026)

GENOT: Entropic (Gromov) Wasserstein Flow Matching with Applications to Single-Cell Genomics
by: Klein, Dominik, et al.
Published: (2023)

Thinking in Frames: How Visual Context and Test-Time Scaling Empower Video Reasoning
by: Li, Chengzu, et al.
Published: (2026)

Noise-Coded Illumination for Forensic and Photometric Video Analysis
by: Michael, Peter F., et al.
Published: (2025)

Video Understanding: From Geometry and Semantics to Unified Models
by: An, Zhaochong, et al.
Published: (2026)

PhysConvex: Physics-Informed 3D Dynamic Convex Radiance Fields for Reconstruction and Simulation
by: Wang, Dan, et al.
Published: (2026)

Text Entry for XR Trove (TEXT): Collecting and Analyzing Techniques for Text Input in XR
by: Bhatia, Arpit, et al.
Published: (2025)

Unlearning-based Neural Interpretations
by: Choi, Ching Lam, et al.
Published: (2024)

Large Vision-Language Models for Knowledge-Grounded Data Annotation of Memes
by: Deng, Shiling, et al.
Published: (2025)

Panel on Synthesizing BPs.
by: Tintore, Joaquin
Published: (2019)

Real-time surface current data in the Ibiza Channel from January to October 2016
by: Tintore, Joaquín
Published: (2016)

Stitched Value Model for Diffusion Alignment
by: Go, Hyojun, et al.
Published: (2026)

OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory
by: An, Zhaochong, et al.
Published: (2025)

MMEarth-Bench: Global Model Adaptation via Multimodal Test-Time Training
by: Gordon, Lucia, et al.
Published: (2026)

Assessing Neural Network Robustness via Adversarial Pivotal Tuning
by: Christensen, Peter Ebert, et al.
Published: (2022)

From Videos to Conversations: Egocentric Instructions for Task Assistance
by: Aggarwal, Lavisha, et al.
Published: (2026)

Disentangled Representation Learning with the Gromov-Monge Gap
by: Uscidda, Théo, et al.
Published: (2024)