| Main Authors: | Jia, Sen, Li, Lei |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2410.03161 |
Similar Items
GUI-Spotlight: Adaptive Iterative Focus Refinement for Enhanced GUI Visual Grounding
by: Lei, Bin, et al.
Published: (2025)
Enhancing Visual Grounding for GUI Agents via Self-Evolutionary Reinforcement Learning
by: Yuan, Xinbin, et al.
Published: (2025)
Activating Visual Context and Commonsense Reasoning through Masked Prediction in VLMs
by: Yu, Jiaao, et al.
Published: (2025)
Grounding Vision and Language to 3D Masks for Long-Horizon Box Rearrangement
by: Malik, Ashish, et al.
Published: (2026)
FlashVTG: Feature Layering and Adaptive Score Handling Network for Video Temporal Grounding
by: Cao, Zhuo, et al.
Published: (2024)
Enhancing Monotonic Modeling with Spatio-Temporal Adaptive Awareness in Diverse Marketing
by: Li, Bin, et al.
Published: (2024)
GuirlVG: Incentivize GUI Visual Grounding via Empirical Exploration on Reinforcement Learning
by: Kang, Weitai, et al.
Published: (2025)
Global Context or Local Detail? Adaptive Visual Grounding for Hallucination Mitigation
by: Jiang, Yubo, et al.
Published: (2026)
GUI-ARP: Enhancing Grounding with Adaptive Region Perception for GUI Agents
by: Ye, Xianhang, et al.
Published: (2025)
Instruction-Guided Visual Masking
by: Zheng, Jinliang, et al.
Published: (2024)
Visualizing the Invisible: Generative Visual Grounding Empowers Universal EEG Understanding in MLLMs
by: Pan, Jun-Yu, et al.
Published: (2026)
On Surprising Effectiveness of Masking Updates in Adaptive Optimizers
by: Joo, Taejong, et al.
Published: (2026)
InteractScience: Programmatic and Visually-Grounded Evaluation of Interactive Scientific Demonstration Code Generation
by: Chen, Qiaosheng, et al.
Published: (2025)
Adaptive Optimization for Enhanced Efficiency in Large-Scale Language Model Training
by: Chen, Jiajing, et al.
Published: (2024)
Look Through Masks: Towards Masked Face Recognition with De-Occlusion Distillation
by: Li, Chenyu, et al.
Published: (2024)
Fragment-Masked Diffusion for Molecular Optimization
by: Li, Kun, et al.
Published: (2024)
VEGAS: Towards Visually Explainable and Grounded Artificial Social Intelligence
by: Li, Hao, et al.
Published: (2025)
Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology
by: Wang, Haochen, et al.
Published: (2025)
Towards Automatic Soccer Commentary Generation with Knowledge-Enhanced Visual Reasoning
by: Jin, Zeyu, et al.
Published: (2026)
Visual Position Prompt for MLLM based Visual Grounding
by: Tang, Wei, et al.
Published: (2025)
SpotAgent: Grounding Visual Geo-localization in Large Vision-Language Models through Agentic Reasoning
by: Jia, Furong, et al.
Published: (2026)
Aria-UI: Visual Grounding for GUI Instructions
by: Yang, Yuhao, et al.
Published: (2024)
TruthLens: Visual Grounding for Universal DeepFake Reasoning
by: Kundu, Rohit, et al.
Published: (2025)
AutoFed: Personalized Federated Traffic Prediction via Adaptive Prompt
by: Zhao, Zijian, et al.
Published: (2025)
EAGLE: Enhanced Visual Grounding Minimizes Hallucinations in Instructional Multimodal Models
by: Villa, Andrés, et al.
Published: (2025)
Enhancing Radiology Report Generation and Visual Grounding using Reinforcement Learning
by: Gundersen, Benjamin, et al.
Published: (2025)
ViaRL: Adaptive Temporal Grounding via Visual Iterated Amplification Reinforcement Learning
by: Xu, Ziqiang, et al.
Published: (2025)
Generative AI-Enhanced Cooperative MEC of UAVs and Ground Stations for Unmanned Surface Vehicles
by: You, Jiahao, et al.
Published: (2025)
Quantum-Enhanced Adversarial Robustness in Artificial Intelligence
by: Sen, Jaydip
Published: (2026)
Alignment-Enhanced Decoding: Defending via Token-Level Adaptive Refining of Probability Distributions
by: Liu, Quan, et al.
Published: (2024)
HyperMask: Adaptive Hypernetwork-based Masks for Continual Learning
by: Książek, Kamil, et al.
Published: (2023)
INVIGORATE: Interactive Visual Grounding and Grasping in Clutter
by: Zhang, Hanbo, et al.
Published: (2021)
OpenGrok: Enhancing SNS Data Processing with Distilled Knowledge and Mask-like Mechanisms
by: AI, Lumen, et al.
Published: (2025)
Grounding and Enhancing Informativeness and Utility in Dataset Distillation
by: Wang, Shaobo, et al.
Published: (2026)
Residual Tokens Enhance Masked Autoencoders for Speech Modeling
by: Sadok, Samir, et al.
Published: (2026)
Think with Grounding: Curriculum Reinforced Reasoning with Video Grounding for Long Video Understanding
by: Chen, Houlun, et al.
Published: (2026)
ADMFormer: An Adaptive-Decomposition Transformer with Time-Varying Masked Spatial Attention for Traffic Forecasting
by: Gu, Ruiwen, et al.
Published: (2026)
SMH-Bench: Benchmarking LLM Agents for Environment-Grounded Reasoning and Action in Smart Homes
by: Li, Kuan, et al.
Published: (2026)
ChatMotion: A Multimodal Multi-Agent for Human Motion Analysis
by: Li, Lei, et al.
Published: (2025)
Seeing the Trees for the Forest: Rethinking Weakly-Supervised Medical Visual Grounding
by: Huy, Ta Duc, et al.
Published: (2025)