| Main Authors: | Contreras, Kebin; Toscano-Palomino, Luis; Mura, Mauro Dalla; Bacca, Jorge |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2510.05408 |
Similar Items
Autoregressive High-Order Finite Difference Modulo Imaging: High-Dynamic Range for Computer Vision Applications
by: Monroy, Brayan, et al.
Published: (2025)
High Dynamic Range Modulo Imaging for Robust Object Detection in Autonomous Driving
by: Contreras, Kebin, et al.
Published: (2025)
Leveraging pretrained RGB denoisers for hyperspectral image restoration
by: Picone, Daniele, et al.
Published: (2026)
PMPNet: Pixel Movement Prediction Network for Monocular Depth Estimation in Dynamic Scenes
by: Peng, Kebin, et al.
Published: (2024)
Projection-Based Correction for Enhancing Deep Inverse Networks
by: Bacca, Jorge
Published: (2025)
Seeing Beyond the Scene: Enhancing Vision-Language Models with Interactional Reasoning
by: Liang, Dayong, et al.
Published: (2025)
SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model
by: Avetisyan, Armen, et al.
Published: (2024)
Positional Bias in Multimodal Embedding Models: Do They Favor the Beginning, the Middle, or the End?
by: Wu, Kebin, et al.
Published: (2025)
TraceFlow: Dynamic 3D Reconstruction of Specular Scenes Driven by Ray Tracing
by: Tao, Jiachen, et al.
Published: (2025)
Seeing Across Views: Benchmarking Spatial Reasoning of Vision-Language Models in Robotic Scenes
by: Feng, Zhiyuan, et al.
Published: (2025)
Seeing Through Clutter: Structured 3D Scene Reconstruction via Iterative Object Removal
by: Aguina-Kang, Rio, et al.
Published: (2026)
Deep Lightweight Unrolled Network for High Dynamic Range Modulo Imaging
by: Monroy, Brayan, et al.
Published: (2026)
Scale Equivariance Regularization and Feature Lifting in High Dynamic Range Modulo Imaging
by: Monroy, Brayan, et al.
Published: (2026)
Seeing Through Words: Controlling Visual Retrieval Quality with Language Models
by: Lu, Jianglin, et al.
Published: (2026)
Seeing Through Reflections: Advancing 3D Scene Reconstruction in Mirror-Containing Environments with Gaussian Splatting
by: Guo, Zijing, et al.
Published: (2025)
VisCon-100K: Leveraging Contextual Web Data for Fine-tuning Vision Language Models
by: Kumar, Gokul Karthik, et al.
Published: (2025)
SLARM: Streaming and Language-Aligned Reconstruction Model for Dynamic Scenes
by: Qiu, Zhicheng, et al.
Published: (2026)
Seeing Speech and Sound: Distinguishing and Locating Audios in Visual Scenes
by: Ryu, Hyeonggon, et al.
Published: (2025)
Unified Scene Representation and Reconstruction for 3D Large Language Models
by: Chu, Tao, et al.
Published: (2024)
Hadamard Row-Wise Generation Algorithm
by: Monroy, Brayan, et al.
Published: (2024)
Seeing the Scene Matters: Revealing Forgetting in Video Understanding Models with a Scene-Aware Long-Video Benchmark
by: Chen, Seng Nam, et al.
Published: (2026)
Seeing Is Believing? A Benchmark for Multimodal Large Language Models on Visual Illusions and Anomalies
by: Hou, Wenjin, et al.
Published: (2026)
Seeing the Evidence, Missing the Answer: Tool-Guided Vision-Language Models on Visual Illusions
by: Wang, Xuesong, et al.
Published: (2026)
AdaptInfer: Adaptive Token Pruning for Vision-Language Model Inference with Dynamical Text Guidance
by: Zhang, Weichen, et al.
Published: (2025)
SplitGaussian: Reconstructing Dynamic Scenes via Visual Geometry Decomposition
by: Li, Jiahui, et al.
Published: (2025)
Do Vision-Language Models See Urban Scenes as People Do? An Urban Perception Benchmark
by: Mushkani, Rashid
Published: (2025)
WaterHE-NeRF: Water-ray Tracing Neural Radiance Fields for Underwater Scene Reconstruction
by: Zhou, Jingchun, et al.
Published: (2023)
Scenario Understanding of Traffic Scenes Through Large Visual Language Models
by: Rivera, Esteban, et al.
Published: (2025)
Seeing the Arrow of Time in Large Multimodal Models
by: Xue, Zihui, et al.
Published: (2025)
Seeing is Believing? Enhancing Vision-Language Navigation using Visual Perturbations
by: Zhang, Xuesong, et al.
Published: (2024)
Seeing through Imagination: Learning Scene Geometry via Implicit Spatial World Modeling
by: Cao, Meng, et al.
Published: (2025)
DepthFocus: Controllable Depth Estimation for See-Through Scenes
by: Min, Junhong, et al.
Published: (2025)
EAG-PT: Emission-Aware Gaussians and Path Tracing for Diffuse Indoor Scene Reconstruction and Editing
by: Yang, Xijie, et al.
Published: (2026)
Eye-See-You: Reverse Pass-Through VR and Head Avatars
by: Dash, Ankan, et al.
Published: (2025)
Vision-Language Models Can't See the Obvious
by: Dahou, Yasser, et al.
Published: (2025)
Real-Time Scene Reconstruction using Light Field Probes
by: Liu, Yaru, et al.
Published: (2025)
Inferring Compositional 4D Scenes without Ever Seeing One
by: Gokmen, Ahmet Berke, et al.
Published: (2025)
Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces
by: Luo, Gen, et al.
Published: (2025)
Robust Scene Change Detection Using Visual Foundation Models and Cross-Attention Mechanisms
by: Lin, Chun-Jung, et al.
Published: (2024)
Toward Autonomous Laboratory Safety Monitoring with Vision Language Models: Learning to See Hazards Through Scene Structure
by: Chakraborty, Trishna, et al.
Published: (2026)