Library Catalog

Bibliographic Details
Main Author:	Salgado, Alberto G. Rodríguez
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2605.13825

Similar Items

Towards Understanding Unsafe Video Generation
by: Pang, Yan, et al.
Published: (2024)

PANC: Prior-Aware Normalized Cut via Anchor-Augmented Token Graphs
by: Gutiérrez, Juan, et al.
Published: (2026)

Safe Vision-Language Models via Unsafe Weights Manipulation
by: D'Incà, Moreno, et al.
Published: (2025)

Does Visual Information Play a Decisive Role in Vision-Language-Action Model Driving Behavior?
by: He, Jingtao, et al.
Published: (2026)

HomeSafe-Bench: Evaluating Vision-Language Models on Unsafe Action Detection for Embodied Agents in Household Scenarios
by: Pu, Jiayue, et al.
Published: (2026)

AttAnchor: Guiding Cross-Modal Token Alignment in VLMs with Attention Anchors
by: Zhang, Junyang, et al.
Published: (2025)

Context Matters: Peer-Aware Student Behavioral Engagement Measurement via VLM Action Parsing and LLM Sequence Classification
by: Abdelkawy, Ahmed, et al.
Published: (2026)

AnchorDiff: Training-Free Concept Grounding for MM-DiTs via Anchor-Based Graph Propagation
by: Zhang, Jian, et al.
Published: (2026)

SAGE: Exploring the Boundaries of Unsafe Concept Domain with Semantic-Augment Erasing
by: Zhu, Hongguang, et al.
Published: (2025)

Diagnosing and Repairing Unsafe Channels in Vision-Language Models via Causal Discovery and Dual-Modal Safety Subspace Projection
by: Fu, Jinhu, et al.
Published: (2026)

MedSteer: Counterfactual Endoscopic Synthesis via Training-Free Activation Steering
by: Pham, Trong-Thang, et al.
Published: (2026)

Bi-Anchor Interpolation Solver for Accelerating Generative Modeling
by: Chen, Hongxu, et al.
Published: (2026)

iPay: Integrated Payment Action Recognition via Multimodal Networks and Adaptive Spatial Prior Learning
by: Huang, Kaicong, et al.
Published: (2026)

DriveAction: A Benchmark for Exploring Human-like Driving Decisions in VLA Models
by: Hao, Yuhan, et al.
Published: (2025)

PromptGuard: Soft Prompt-Guided Unsafe Content Moderation for Text-to-Image Models
by: Yuan, Lingzhi, et al.
Published: (2025)

GuardTrace-VL: Detecting Unsafe Multimodel Reasoning via Iterative Safety Supervision
by: Xiang, Yuxiao, et al.
Published: (2025)

RAD: Retrieval-Augmented Decision-Making of Meta-Actions with Vision-Language Models in Autonomous Driving
by: Wang, Yujin, et al.
Published: (2025)

Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising
by: Guo, Jun, et al.
Published: (2026)

Enhancing Action Recognition by Leveraging the Hierarchical Structure of Actions and Textual Context
by: Benavent-Lledo, Manuel, et al.
Published: (2024)

Inline Critic Steers Image Editing
by: Kang, Weitai, et al.
Published: (2026)

InstrAct: Towards Action-Centric Understanding in Instructional Videos
by: Yang, Zhuoyi, et al.
Published: (2026)

ReGenNet: Towards Human Action-Reaction Synthesis
by: Xu, Liang, et al.
Published: (2024)

Beyond the Safety Tax: Mitigating Unsafe Text-to-Image Generation via External Safety Rectification
by: Meng, Xiangtao, et al.
Published: (2025)

Beyond Fixed Anchors: Precisely Erasing Concepts with Sibling Exclusive Counterparts
by: Zhang, Tong, et al.
Published: (2025)

VLIPP: Towards Physically Plausible Video Generation with Vision and Language Informed Physical Prior
by: Yang, Xindi, et al.
Published: (2025)

IVAC-P2L: Leveraging Irregular Repetition Priors for Improving Video Action Counting
by: Wang, Hang, et al.
Published: (2024)

Decoding Vision Transformers: the Diffusion Steering Lens
by: Takatsuki, Ryota, et al.
Published: (2025)

AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories
by: Wang, Zun, et al.
Published: (2026)

ACPO: Anchor-Constrained Perceptual Optimization for Diffusion Models with No-Reference Quality Guidance
by: Yang, Yang, et al.
Published: (2026)

PBADet: A One-Stage Anchor-Free Approach for Part-Body Association
by: Gao, Zhongpai, et al.
Published: (2024)

Proxy-Anchor and EVT-Driven Continual Learning Method for Generalized Category Discovery
by: Fathalizadeh, Alireza, et al.
Published: (2025)

PMG: Progressive Motion Generation via Sparse Anchor Postures Curriculum Learning
by: Xi, Yingjie, et al.
Published: (2025)

Segmenting Visuals With Querying Words: Language Anchors For Semi-Supervised Image Segmentation
by: Nadeem, Numair, et al.
Published: (2025)

From Spatial to Actions: Grounding Vision-Language-Action Model in Spatial Foundation Priors
by: Zhang, Zhengshen, et al.
Published: (2025)

How to Enable LLM with 3D Capacity? A Survey of Spatial Reasoning in LLM
by: Zha, Jirong, et al.
Published: (2025)

EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding
by: Li, Yuan-Ming, et al.
Published: (2024)

EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance
by: Wang, Zun, et al.
Published: (2025)

A-MESS: Anchor based Multimodal Embedding with Semantic Synchronization for Multimodal Intent Recognition
by: Shen, Yaomin, et al.
Published: (2025)

Learning with Instance-Dependent Noisy Labels by Anchor Hallucination and Hard Sample Label Correction
by: Huang, Po-Hsuan, et al.
Published: (2024)

TAR-TVG: Enhancing VLMs with Timestamp Anchor-Constrained Reasoning for Temporal Video Grounding
by: Guo, Chaohong, et al.
Published: (2025)