| Main Authors: | Jiang, Sheng, Ning, Yuanmin, Huang, Bingxi, Chen, Peiyin, Chen, Zhaohui |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.00603 |
Similar Items
DEMO: Disentangled Motion Latent Flow Matching for Fine-Grained Controllable Talking Portrait Synthesis
by: Chen, Peiyin, et al.
Published: (2025)
Broken Memories: Detecting and Mitigating Memorization in Diffusion Models with Degraded Generations
by: Huang, Yuanmin, et al.
Published: (2026)
EntropyScan: Towards Model-level Backdoor Detection in LVLMs via Visual Attention Entropy
by: Ge, Xuanyu, et al.
Published: (2026)
EmbodiedPlace: Learning Mixture-of-Features with Embodied Constraints for Visual Place Recognition
by: Liu, Bingxi, et al.
Published: (2025)
Learnable Query Aggregation with KV Routing for Cross-view Geo-localisation
by: Ye, Hualin, et al.
Published: (2025)
Hierarchical Visual Relocalization with Nearest View Synthesis from Feature Gaussian Splatting
by: Tao, Huaqi, et al.
Published: (2026)
Evidence Packing for Cross-Domain Image Deepfake Detection with LVLMs
by: Liu, Yuxin, et al.
Published: (2026)
TextInPlace: Indoor Visual Place Recognition in Repetitive Structures with Scene Text Spotting and Verification
by: Tao, Huaqi, et al.
Published: (2025)
Attention Hijackers: Detect and Disentangle Attention Hijacking in LVLMs for Hallucination Mitigation
by: Chen, Beitao, et al.
Published: (2025)
MT-PCR: Hybrid Mamba-Transformer Network with Spatial Serialization for Point Cloud Registration
by: Liu, Bingxi, et al.
Published: (2025)
Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs
by: Liu, Shi, et al.
Published: (2024)
MVI-Bench: A Comprehensive Benchmark for Evaluating Robustness to Misleading Visual Inputs in LVLMs
by: Chen, Huiyi, et al.
Published: (2025)
Unlocking Multilingual Reasoning Capability of LLMs and LVLMs through Representation Engineering
by: Li, Qiming, et al.
Published: (2025)
SHALE: A Scalable Benchmark for Fine-grained Hallucination Evaluation in LVLMs
by: Yan, Bei, et al.
Published: (2025)
Dysca: A Dynamic and Scalable Benchmark for Evaluating Perception Ability of LVLMs
by: Zhang, Jie, et al.
Published: (2024)
HG-Lane: High-Fidelity Generation of Lane Scenes under Adverse Weather and Lighting Conditions without Re-annotation
by: Zhao, Daichao, et al.
Published: (2026)
V-Attack: Targeting Disentangled Value Features for Controllable Adversarial Attacks on LVLMs
by: Nie, Sen, et al.
Published: (2025)
Optimizing LVLMs with On-Policy Data for Effective Hallucination Mitigation
by: Yu, Chengzhi, et al.
Published: (2025)
Vision Mamba-based autonomous crack segmentation on concrete, asphalt, and masonry surfaces
by: Chen, Zhaohui, et al.
Published: (2024)
Robust feature knowledge distillation for enhanced performance of lightweight crack segmentation models
by: Chen, Zhaohui, et al.
Published: (2024)
Benchmarking Corruption Robustness of LVLMs: A Discriminative Benchmark and Robustness Alignment Metric
by: Sui, Xiangjie, et al.
Published: (2025)
CATCH: Complementary Adaptive Token-level Contrastive Decoding to Mitigate Hallucinations in LVLMs
by: Kan, Zhehan, et al.
Published: (2024)
IVC-Prune: Revealing the Implicit Visual Coordinates in LVLMs for Vision Token Pruning
by: Sun, Zhichao, et al.
Published: (2026)
Neural Gate: Mitigating Privacy Risks in LVLMs via Neuron-Level Gradient Gating
by: Cao, Xiangkui, et al.
Published: (2026)
Getting to the Point: Pointing Improves LVLMs at Counting
by: Alghisi, Simone, et al.
Published: (2026)
AutoV: Loss-Oriented Ranking for Visual Prompt Retrieval in LVLMs
by: Zhang, Yuan, et al.
Published: (2025)
See It, Say It, Sorted: An Iterative Training-Free Framework for Visually-Grounded Multimodal Reasoning in LVLMs
by: Zhang, Yongchang, et al.
Published: (2026)
IWR-Bench: Can LVLMs reconstruct interactive webpage from a user interaction video?
by: Chen, Yang, et al.
Published: (2025)
ZScribbleSeg: A comprehensive segmentation framework with modeling of efficient annotation and maximization of scribble supervision
by: Zhang, Ke, et al.
Published: (2026)
MMFakeBench: A Mixed-Source Multimodal Misinformation Detection Benchmark for LVLMs
by: Liu, Xuannan, et al.
Published: (2024)
Self-Prophetic Decoding to Unlock Visual Search in LVLMs
by: He, Zhendong, et al.
Published: (2026)
Revealing Perception and Generation Dynamics in LVLMs: Mitigating Hallucinations via Validated Dominance Correction
by: Lyu, Guangtao, et al.
Published: (2025)
Improving Alignment in LVLMs with Debiased Self-Judgment
by: Yang, Sihan, et al.
Published: (2025)
TrackingMiM: Efficient Mamba-in-Mamba Serialization for Real-time UAV Object Tracking
by: Liu, Bingxi, et al.
Published: (2025)
SuperPlace: The Renaissance of Classical Feature Aggregation for Visual Place Recognition in the Era of Foundation Models
by: Liu, Bingxi, et al.
Published: (2025)
Video-SafetyBench: A Benchmark for Safety Evaluation of Video LVLMs
by: Liu, Xuannan, et al.
Published: (2025)
GenVideoLens: Where LVLMs Fall Short in AI-Generated Video Detection?
by: Zou, Yueying, et al.
Published: (2026)
Remember Me: Bridging the Long-Range Gap in LVLMs with Three-Step Inference-Only Decay Resilience Strategies
by: Gao, Peng, et al.
Published: (2025)
CHASD: Language Increment-Calibrated Contrastive Decoding against Hallucination in LVLMs
by: Huang, Xiaoyi, et al.
Published: (2026)
TARAC: Mitigating Hallucination in LVLMs via Temporal Attention Real-time Accumulative Connection
by: Jiang, Lei, et al.
Published: (2025)