| Main Author: | Salgado, Alberto G. Rodríguez |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.13825 |
Similar Items
Towards Understanding Unsafe Video Generation
by: Pang, Yan, et al.
Published: (2024)
PANC: Prior-Aware Normalized Cut via Anchor-Augmented Token Graphs
by: Gutiérrez, Juan, et al.
Published: (2026)
Safe Vision-Language Models via Unsafe Weights Manipulation
by: D'Incà, Moreno, et al.
Published: (2025)
Does Visual Information Play a Decisive Role in Vision-Language-Action Model Driving Behavior?
by: He, Jingtao, et al.
Published: (2026)
HomeSafe-Bench: Evaluating Vision-Language Models on Unsafe Action Detection for Embodied Agents in Household Scenarios
by: Pu, Jiayue, et al.
Published: (2026)
AttAnchor: Guiding Cross-Modal Token Alignment in VLMs with Attention Anchors
by: Zhang, Junyang, et al.
Published: (2025)
Context Matters: Peer-Aware Student Behavioral Engagement Measurement via VLM Action Parsing and LLM Sequence Classification
by: Abdelkawy, Ahmed, et al.
Published: (2026)
AnchorDiff: Training-Free Concept Grounding for MM-DiTs via Anchor-Based Graph Propagation
by: Zhang, Jian, et al.
Published: (2026)
SAGE: Exploring the Boundaries of Unsafe Concept Domain with Semantic-Augment Erasing
by: Zhu, Hongguang, et al.
Published: (2025)
Diagnosing and Repairing Unsafe Channels in Vision-Language Models via Causal Discovery and Dual-Modal Safety Subspace Projection
by: Fu, Jinhu, et al.
Published: (2026)
MedSteer: Counterfactual Endoscopic Synthesis via Training-Free Activation Steering
by: Pham, Trong-Thang, et al.
Published: (2026)
Bi-Anchor Interpolation Solver for Accelerating Generative Modeling
by: Chen, Hongxu, et al.
Published: (2026)
iPay: Integrated Payment Action Recognition via Multimodal Networks and Adaptive Spatial Prior Learning
by: Huang, Kaicong, et al.
Published: (2026)
DriveAction: A Benchmark for Exploring Human-like Driving Decisions in VLA Models
by: Hao, Yuhan, et al.
Published: (2025)
PromptGuard: Soft Prompt-Guided Unsafe Content Moderation for Text-to-Image Models
by: Yuan, Lingzhi, et al.
Published: (2025)
GuardTrace-VL: Detecting Unsafe Multimodel Reasoning via Iterative Safety Supervision
by: Xiang, Yuxiao, et al.
Published: (2025)
RAD: Retrieval-Augmented Decision-Making of Meta-Actions with Vision-Language Models in Autonomous Driving
by: Wang, Yujin, et al.
Published: (2025)
Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising
by: Guo, Jun, et al.
Published: (2026)
Enhancing Action Recognition by Leveraging the Hierarchical Structure of Actions and Textual Context
by: Benavent-Lledo, Manuel, et al.
Published: (2024)
Inline Critic Steers Image Editing
by: Kang, Weitai, et al.
Published: (2026)
InstrAct: Towards Action-Centric Understanding in Instructional Videos
by: Yang, Zhuoyi, et al.
Published: (2026)
ReGenNet: Towards Human Action-Reaction Synthesis
by: Xu, Liang, et al.
Published: (2024)
Beyond the Safety Tax: Mitigating Unsafe Text-to-Image Generation via External Safety Rectification
by: Meng, Xiangtao, et al.
Published: (2025)
Beyond Fixed Anchors: Precisely Erasing Concepts with Sibling Exclusive Counterparts
by: Zhang, Tong, et al.
Published: (2025)
VLIPP: Towards Physically Plausible Video Generation with Vision and Language Informed Physical Prior
by: Yang, Xindi, et al.
Published: (2025)
IVAC-P2L: Leveraging Irregular Repetition Priors for Improving Video Action Counting
by: Wang, Hang, et al.
Published: (2024)
Decoding Vision Transformers: the Diffusion Steering Lens
by: Takatsuki, Ryota, et al.
Published: (2025)
AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories
by: Wang, Zun, et al.
Published: (2026)
ACPO: Anchor-Constrained Perceptual Optimization for Diffusion Models with No-Reference Quality Guidance
by: Yang, Yang, et al.
Published: (2026)
PBADet: A One-Stage Anchor-Free Approach for Part-Body Association
by: Gao, Zhongpai, et al.
Published: (2024)
Proxy-Anchor and EVT-Driven Continual Learning Method for Generalized Category Discovery
by: Fathalizadeh, Alireza, et al.
Published: (2025)
PMG: Progressive Motion Generation via Sparse Anchor Postures Curriculum Learning
by: Xi, Yingjie, et al.
Published: (2025)
Segmenting Visuals With Querying Words: Language Anchors For Semi-Supervised Image Segmentation
by: Nadeem, Numair, et al.
Published: (2025)
From Spatial to Actions: Grounding Vision-Language-Action Model in Spatial Foundation Priors
by: Zhang, Zhengshen, et al.
Published: (2025)
How to Enable LLM with 3D Capacity? A Survey of Spatial Reasoning in LLM
by: Zha, Jirong, et al.
Published: (2025)
EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding
by: Li, Yuan-Ming, et al.
Published: (2024)
EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance
by: Wang, Zun, et al.
Published: (2025)
A-MESS: Anchor based Multimodal Embedding with Semantic Synchronization for Multimodal Intent Recognition
by: Shen, Yaomin, et al.
Published: (2025)
Learning with Instance-Dependent Noisy Labels by Anchor Hallucination and Hard Sample Label Correction
by: Huang, Po-Hsuan, et al.
Published: (2024)
TAR-TVG: Enhancing VLMs with Timestamp Anchor-Constrained Reasoning for Temporal Video Grounding
by: Guo, Chaohong, et al.
Published: (2025)