:: Library Catalog

Bibliographic Details
Main Authors:	Huang, Xinmiao; Hu, Jinwei; Roy, Rajarshi; Wu, Changshun; Dong, Yi; Huang, Xiaowei
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.06455

Similar Items

Lying with Truths: Open-Channel Multi-Agent Collusion for Belief Manipulation via Generative Montage
by: Hu, Jinwei, et al.
Published: (2026)

Enhancing Robustness of LLM-Driven Multi-Agent Systems through Randomized Smoothing
by: Hu, Jinwei, et al.
Published: (2025)

Responsible Agentic AI Requires Explicit Provenance
by: Hu, Jinwei, et al.
Published: (2026)

Trust-Oriented Adaptive Guardrails for Large Language Models
by: Hu, Jinwei, et al.
Published: (2024)

Where Do Prompt Perturbations Break Generation? A Segment-Level View of Robustness in LoRA-Tuned Language Models
by: Li, Zhuoyun, et al.
Published: (2026)

Chain-of-Thought as a Lens: Evaluating Structured Reasoning Alignment between Human Preferences and Large Language Models
by: Wang, Boxuan, et al.
Published: (2025)

Safety-Constrained Reinforcement Learning with Post-Training Reachability Verification for Robot Navigation
by: He, Qisong, et al.
Published: (2026)

DDSA: Dual-Domain Strategic Attack for Spatial-Temporal Efficiency in Adversarial Robustness Testing
by: Hu, Jinwei, et al.
Published: (2026)

Tiny Refinements Elicit Resilience: Toward Efficient Prefix-Model Against LLM Red-Teaming
by: Liu, Jiaxu, et al.
Published: (2024)

Position: Towards a Responsible LLM-empowered Multi-Agent Systems
by: Hu, Jinwei, et al.
Published: (2025)

Safeguarding Large Language Models: A Survey
by: Dong, Yi, et al.
Published: (2024)

FragileFlow: Spectral Control of Correct-but-Fragile Predictions for Foundation Model Robustness
by: Li, Zhuoyun, et al.
Published: (2026)

Dive into Ambiguity: A*-Inspired Multi-Agents Commonsense Obfuscation Attack on LLM Prompts
by: Wang, Boxuan, et al.
Published: (2026)

Revisiting Out-of-Distribution Detection in Real-time Object Detection: From Benchmark Pitfalls to a New Mitigation Paradigm
by: Wu, Changshun, et al.
Published: (2025)

FALCON: Fine-grained Activation Manipulation by Contrastive Orthogonal Unalignment for Large Language Model
by: Hu, Jinwei, et al.
Published: (2025)

Grounded Continuation: A Linear-Time Runtime Verifier for LLM Conversations
by: He, Qisong, et al.
Published: (2026)

Hierarchical Testing with Rabbit Optimization for Industrial Cyber-Physical Systems
by: Hu, Jinwei, et al.
Published: (2025)

ProbGuard: Probabilistic Runtime Monitoring for LLM Agent Safety
by: Wang, Haoyu, et al.
Published: (2025)

PrefixAgent: An LLM-Powered Design Framework for Efficient Prefix Adder Optimization
by: Zuo, Dongsheng, et al.
Published: (2025)

What, Indeed, is an Achievable Provable Guarantee for Learning-Enabled Safety Critical Systems
by: Bensalem, Saddek, et al.
Published: (2023)

BenchGuard: Who Guards the Benchmarks? Automated Auditing of LLM Agent Benchmarks
by: Tu, Xinming, et al.
Published: (2026)

PrefixLLM: LLM-aided Prefix Circuit Design
by: Xiao, Weihua, et al.
Published: (2024)

Rethinking Multi-Agent Intelligence Through the Lens of Small-World Networks
by: Wang, Boxuan, et al.
Published: (2025)

TAIJI: Textual Anchoring for Immunizing Jailbreak Images in Vision Language Models
by: Yin, Xiangyu, et al.
Published: (2025)

Building Guardrails for Large Language Models
by: Dong, Yi, et al.
Published: (2024)

RouteGuard: Internal-Signal Detection of Skill Poisoning in LLM Agents
by: Xiao, Wenjie, et al.
Published: (2026)

From Prefix Cache to Fusion RAG Cache: Accelerating LLM Inference in Retrieval-Augmented Generation
by: Wang, Jiahao, et al.
Published: (2026)

PrefixMemory-Tuning: Modernizing Prefix-Tuning by Decoupling the Prefix from Attention
by: Wang, Haonan, et al.
Published: (2025)

TraceGuard: Structured Multi-Dimensional Monitoring as a Collusion-Resistant Control Protocol
by: Nguyen, Khanh Linh, et al.
Published: (2026)

Where LLM Agents Fail and How They can Learn From Failures
by: Zhu, Kunlun, et al.
Published: (2025)

Requesting Expert Reasoning: Augmenting LLM Agents with Learned Collaborative Intervention
by: Wang, Zhiming, et al.
Published: (2026)

Counterfactual Trace Auditing of LLM Agent Skills
by: Zhou, Xiaolin, et al.
Published: (2026)

RoS-Guard: Robust and Scalable Online Change Detection with Delay-Optimal Guarantees
by: Zhu, Zelin, et al.
Published: (2025)

Tapas Are Free! Training-Free Adaptation of Programmatic Agents via LLM-Guided Program Synthesis in Dynamic Environments
by: Hu, Jinwei, et al.
Published: (2025)

Reachability Verification Based Reliability Assessment for Deep Reinforcement Learning Controlled Robotics and Autonomous Systems
by: Dong, Yi, et al.
Published: (2022)

Safe Pruning LoRA: Robust Distance-Guided Pruning for Safety Alignment in Adaptation of LLMs
by: Ao, Shuang, et al.
Published: (2025)

SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model
by: Huang, Zhenglin, et al.
Published: (2024)

BenchTrace: A Benchmark for Testing Reflection Ability and Controlled Evolution in LLM Agents
by: Huang, Jiahao, et al.
Published: (2026)

What is Formal Verification without Specifications? A Survey on mining LTL Specifications
by: Neider, Daniel, et al.
Published: (2025)

A-MemGuard: A Proactive Defense Framework for LLM-Based Agent Memory
by: Wei, Qianshan, et al.
Published: (2025)