:: Library Catalog

Bibliographic Details
Main Authors:	Jia, Sen; Li, Lei
Format:	Preprint
Published:	2024
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2410.03161

Similar Items

GUI-Spotlight: Adaptive Iterative Focus Refinement for Enhanced GUI Visual Grounding
by: Lei, Bin, et al.
Published: (2025)

Enhancing Visual Grounding for GUI Agents via Self-Evolutionary Reinforcement Learning
by: Yuan, Xinbin, et al.
Published: (2025)

Activating Visual Context and Commonsense Reasoning through Masked Prediction in VLMs
by: Yu, Jiaao, et al.
Published: (2025)

Grounding Vision and Language to 3D Masks for Long-Horizon Box Rearrangement
by: Malik, Ashish, et al.
Published: (2026)

FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding
by: Cao, Zhuo, et al.
Published: (2024)

Enhancing Monotonic Modeling with Spatio-Temporal Adaptive Awareness in Diverse Marketing
by: Li, Bin, et al.
Published: (2024)

GuirlVG: Incentivize GUI Visual Grounding via Empirical Exploration on Reinforcement Learning
by: Kang, Weitai, et al.
Published: (2025)

Global Context or Local Detail? Adaptive Visual Grounding for Hallucination Mitigation
by: Jiang, Yubo, et al.
Published: (2026)

GUI-ARP: Enhancing Grounding with Adaptive Region Perception for GUI Agents
by: Ye, Xianhang, et al.
Published: (2025)

Instruction-Guided Visual Masking
by: Zheng, Jinliang, et al.
Published: (2024)

Visualizing the Invisible: Generative Visual Grounding Empowers Universal EEG Understanding in MLLMs
by: Pan, Jun-Yu, et al.
Published: (2026)

On Surprising Effectiveness of Masking Updates in Adaptive Optimizers
by: Joo, Taejong, et al.
Published: (2026)

InteractScience: Programmatic and Visually-Grounded Evaluation of Interactive Scientific Demonstration Code Generation
by: Chen, Qiaosheng, et al.
Published: (2025)

Adaptive Optimization for Enhanced Efficiency in Large-Scale Language Model Training
by: Chen, Jiajing, et al.
Published: (2024)

Look Through Masks: Towards Masked Face Recognition with De-Occlusion Distillation
by: Li, Chenyu, et al.
Published: (2024)

Fragment-Masked Diffusion for Molecular Optimization
by: Li, Kun, et al.
Published: (2024)

VEGAS: Towards Visually Explainable and Grounded Artificial Social Intelligence
by: Li, Hao, et al.
Published: (2025)

Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology
by: Wang, Haochen, et al.
Published: (2025)

Towards Automatic Soccer Commentary Generation with Knowledge-Enhanced Visual Reasoning
by: Jin, Zeyu, et al.
Published: (2026)

Visual Position Prompt for MLLM based Visual Grounding
by: Tang, Wei, et al.
Published: (2025)

SpotAgent: Grounding Visual Geo-localization in Large Vision-Language Models through Agentic Reasoning
by: Jia, Furong, et al.
Published: (2026)

Aria-UI: Visual Grounding for GUI Instructions
by: Yang, Yuhao, et al.
Published: (2024)

TruthLens: Visual Grounding for Universal DeepFake Reasoning
by: Kundu, Rohit, et al.
Published: (2025)

AutoFed: Personalized Federated Traffic Prediction via Adaptive Prompt
by: Zhao, Zijian, et al.
Published: (2025)

EAGLE: Enhanced Visual Grounding Minimizes Hallucinations in Instructional Multimodal Models
by: Villa, Andrés, et al.
Published: (2025)

Enhancing Radiology Report Generation and Visual Grounding using Reinforcement Learning
by: Gundersen, Benjamin, et al.
Published: (2025)

ViaRL: Adaptive Temporal Grounding via Visual Iterated Amplification Reinforcement Learning
by: Xu, Ziqiang, et al.
Published: (2025)

Generative AI-Enhanced Cooperative MEC of UAVs and Ground Stations for Unmanned Surface Vehicles
by: You, Jiahao, et al.
Published: (2025)

Quantum-Enhanced Adversarial Robustness in Artificial Intelligence
by: Sen, Jaydip
Published: (2026)

Alignment-Enhanced Decoding: Defending via Token-Level Adaptive Refining of Probability Distributions
by: Liu, Quan, et al.
Published: (2024)

HyperMask: Adaptive Hypernetwork-based Masks for Continual Learning
by: Książek, Kamil, et al.
Published: (2023)

INVIGORATE: Interactive Visual Grounding and Grasping in Clutter
by: Zhang, Hanbo, et al.
Published: (2021)

OpenGrok: Enhancing SNS Data Processing with Distilled Knowledge and Mask-like Mechanisms
by: AI, Lumen, et al.
Published: (2025)

Grounding and Enhancing Informativeness and Utility in Dataset Distillation
by: Wang, Shaobo, et al.
Published: (2026)

Residual Tokens Enhance Masked Autoencoders for Speech Modeling
by: Sadok, Samir, et al.
Published: (2026)

Think with Grounding: Curriculum Reinforced Reasoning with Video Grounding for Long Video Understanding
by: Chen, Houlun, et al.
Published: (2026)

ADMFormer: An Adaptive-Decomposition Transformer with Time-Varying Masked Spatial Attention for Traffic Forecasting
by: Gu, Ruiwen, et al.
Published: (2026)

SMH-Bench: Benchmarking LLM Agents for Environment-Grounded Reasoning and Action in Smart Homes
by: Li, Kuan, et al.
Published: (2026)

ChatMotion: A Multimodal Multi-Agent for Human Motion Analysis
by: Li, Lei, et al.
Published: (2025)

Seeing the Trees for the Forest: Rethinking Weakly-Supervised Medical Visual Grounding
by: Huy, Ta Duc, et al.
Published: (2025)