:: Library Catalog

Bibliographic Details
Main Authors:	Jiang, Sheng; Ning, Yuanmin; Huang, Bingxi; Chen, Peiyin; Chen, Zhaohui
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2510.00603

Similar Items

DEMO: Disentangled Motion Latent Flow Matching for Fine-Grained Controllable Talking Portrait Synthesis
by: Chen, Peiyin, et al.
Published: (2025)

Broken Memories: Detecting and Mitigating Memorization in Diffusion Models with Degraded Generations
by: Huang, Yuanmin, et al.
Published: (2026)

EntropyScan: Towards Model-level Backdoor Detection in LVLMs via Visual Attention Entropy
by: Ge, Xuanyu, et al.
Published: (2026)

EmbodiedPlace: Learning Mixture-of-Features with Embodied Constraints for Visual Place Recognition
by: Liu, Bingxi, et al.
Published: (2025)

Learnable Query Aggregation with KV Routing for Cross-view Geo-localisation
by: Ye, Hualin, et al.
Published: (2025)

Hierarchical Visual Relocalization with Nearest View Synthesis from Feature Gaussian Splatting
by: Tao, Huaqi, et al.
Published: (2026)

Evidence Packing for Cross-Domain Image Deepfake Detection with LVLMs
by: Liu, Yuxin, et al.
Published: (2026)

TextInPlace: Indoor Visual Place Recognition in Repetitive Structures with Scene Text Spotting and Verification
by: Tao, Huaqi, et al.
Published: (2025)

Attention Hijackers: Detect and Disentangle Attention Hijacking in LVLMs for Hallucination Mitigation
by: Chen, Beitao, et al.
Published: (2025)

MT-PCR: Hybrid Mamba-Transformer Network with Spatial Serialization for Point Cloud Registration
by: Liu, Bingxi, et al.
Published: (2025)

Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs
by: Liu, Shi, et al.
Published: (2024)

MVI-Bench: A Comprehensive Benchmark for Evaluating Robustness to Misleading Visual Inputs in LVLMs
by: Chen, Huiyi, et al.
Published: (2025)

Unlocking Multilingual Reasoning Capability of LLMs and LVLMs through Representation Engineering
by: Li, Qiming, et al.
Published: (2025)

SHALE: A Scalable Benchmark for Fine-grained Hallucination Evaluation in LVLMs
by: Yan, Bei, et al.
Published: (2025)

Dysca: A Dynamic and Scalable Benchmark for Evaluating Perception Ability of LVLMs
by: Zhang, Jie, et al.
Published: (2024)

HG-Lane: High-Fidelity Generation of Lane Scenes under Adverse Weather and Lighting Conditions without Re-annotation
by: Zhao, Daichao, et al.
Published: (2026)

V-Attack: Targeting Disentangled Value Features for Controllable Adversarial Attacks on LVLMs
by: Nie, Sen, et al.
Published: (2025)

Optimizing LVLMs with On-Policy Data for Effective Hallucination Mitigation
by: Yu, Chengzhi, et al.
Published: (2025)

Vision Mamba-based autonomous crack segmentation on concrete, asphalt, and masonry surfaces
by: Chen, Zhaohui, et al.
Published: (2024)

Robust feature knowledge distillation for enhanced performance of lightweight crack segmentation models
by: Chen, Zhaohui, et al.
Published: (2024)

Benchmarking Corruption Robustness of LVLMs: A Discriminative Benchmark and Robustness Alignment Metric
by: Sui, Xiangjie, et al.
Published: (2025)

CATCH: Complementary Adaptive Token-level Contrastive Decoding to Mitigate Hallucinations in LVLMs
by: Kan, Zhehan, et al.
Published: (2024)

IVC-Prune: Revealing the Implicit Visual Coordinates in LVLMs for Vision Token Pruning
by: Sun, Zhichao, et al.
Published: (2026)

Neural Gate: Mitigating Privacy Risks in LVLMs via Neuron-Level Gradient Gating
by: Cao, Xiangkui, et al.
Published: (2026)

Getting to the Point: Pointing Improves LVLMs at Counting
by: Alghisi, Simone, et al.
Published: (2026)

AutoV: Loss-Oriented Ranking for Visual Prompt Retrieval in LVLMs
by: Zhang, Yuan, et al.
Published: (2025)

See It, Say It, Sorted: An Iterative Training-Free Framework for Visually-Grounded Multimodal Reasoning in LVLMs
by: Zhang, Yongchang, et al.
Published: (2026)

IWR-Bench: Can LVLMs reconstruct interactive webpage from a user interaction video?
by: Chen, Yang, et al.
Published: (2025)

ZScribbleSeg: A comprehensive segmentation framework with modeling of efficient annotation and maximization of scribble supervision
by: Zhang, Ke, et al.
Published: (2026)

MMFakeBench: A Mixed-Source Multimodal Misinformation Detection Benchmark for LVLMs
by: Liu, Xuannan, et al.
Published: (2024)

Self-Prophetic Decoding to Unlock Visual Search in LVLMs
by: He, Zhendong, et al.
Published: (2026)

Revealing Perception and Generation Dynamics in LVLMs: Mitigating Hallucinations via Validated Dominance Correction
by: Lyu, Guangtao, et al.
Published: (2025)

Improving Alignment in LVLMs with Debiased Self-Judgment
by: Yang, Sihan, et al.
Published: (2025)

TrackingMiM: Efficient Mamba-in-Mamba Serialization for Real-time UAV Object Tracking
by: Liu, Bingxi, et al.
Published: (2025)

SuperPlace: The Renaissance of Classical Feature Aggregation for Visual Place Recognition in the Era of Foundation Models
by: Liu, Bingxi, et al.
Published: (2025)

Video-SafetyBench: A Benchmark for Safety Evaluation of Video LVLMs
by: Liu, Xuannan, et al.
Published: (2025)

GenVideoLens: Where LVLMs Fall Short in AI-Generated Video Detection?
by: Zou, Yueying, et al.
Published: (2026)

Remember Me: Bridging the Long-Range Gap in LVLMs with Three-Step Inference-Only Decay Resilience Strategies
by: Gao, Peng, et al.
Published: (2025)

CHASD: Language Increment-Calibrated Contrastive Decoding against Hallucination in LVLMs
by: Huang, Xiaoyi, et al.
Published: (2026)

TARAC: Mitigating Hallucination in LVLMs via Temporal Attention Real-time Accumulative Connection
by: Jiang, Lei, et al.
Published: (2025)