| Main Authors: | Huang, Xinmiao, Hu, Jinwei, Roy, Rajarshi, Wu, Changshun, Dong, Yi, Huang, Xiaowei |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2605.06455 |
Similar Items
Lying with Truths: Open-Channel Multi-Agent Collusion for Belief Manipulation via Generative Montage
by: Hu, Jinwei, et al.
Published: (2026)
Enhancing Robustness of LLM-Driven Multi-Agent Systems through Randomized Smoothing
by: Hu, Jinwei, et al.
Published: (2025)
Responsible Agentic AI Requires Explicit Provenance
by: Hu, Jinwei, et al.
Published: (2026)
Trust-Oriented Adaptive Guardrails for Large Language Models
by: Hu, Jinwei, et al.
Published: (2024)
Where Do Prompt Perturbations Break Generation? A Segment-Level View of Robustness in LoRA-Tuned Language Models
by: Li, Zhuoyun, et al.
Published: (2026)
Chain-of-Thought as a Lens: Evaluating Structured Reasoning Alignment between Human Preferences and Large Language Models
by: Wang, Boxuan, et al.
Published: (2025)
Safety-Constrained Reinforcement Learning with Post-Training Reachability Verification for Robot Navigation
by: He, Qisong, et al.
Published: (2026)
DDSA: Dual-Domain Strategic Attack for Spatial-Temporal Efficiency in Adversarial Robustness Testing
by: Hu, Jinwei, et al.
Published: (2026)
Tiny Refinements Elicit Resilience: Toward Efficient Prefix-Model Against LLM Red-Teaming
by: Liu, Jiaxu, et al.
Published: (2024)
Position: Towards a Responsible LLM-empowered Multi-Agent Systems
by: Hu, Jinwei, et al.
Published: (2025)
Safeguarding Large Language Models: A Survey
by: Dong, Yi, et al.
Published: (2024)
FragileFlow: Spectral Control of Correct-but-Fragile Predictions for Foundation Model Robustness
by: Li, Zhuoyun, et al.
Published: (2026)
Dive into Ambiguity: A*-Inspired Multi-Agents Commonsense Obfuscation Attack on LLM Prompts
by: Wang, Boxuan, et al.
Published: (2026)
Revisiting Out-of-Distribution Detection in Real-time Object Detection: From Benchmark Pitfalls to a New Mitigation Paradigm
by: Wu, Changshun, et al.
Published: (2025)
FALCON: Fine-grained Activation Manipulation by Contrastive Orthogonal Unalignment for Large Language Model
by: Hu, Jinwei, et al.
Published: (2025)
Grounded Continuation: A Linear-Time Runtime Verifier for LLM Conversations
by: He, Qisong, et al.
Published: (2026)
Hierarchical Testing with Rabbit Optimization for Industrial Cyber-Physical Systems
by: Hu, Jinwei, et al.
Published: (2025)
ProbGuard: Probabilistic Runtime Monitoring for LLM Agent Safety
by: Wang, Haoyu, et al.
Published: (2025)
PrefixAgent: An LLM-Powered Design Framework for Efficient Prefix Adder Optimization
by: Zuo, Dongsheng, et al.
Published: (2025)
What, Indeed, is an Achievable Provable Guarantee for Learning-Enabled Safety Critical Systems
by: Bensalem, Saddek, et al.
Published: (2023)
BenchGuard: Who Guards the Benchmarks? Automated Auditing of LLM Agent Benchmarks
by: Tu, Xinming, et al.
Published: (2026)
PrefixLLM: LLM-aided Prefix Circuit Design
by: Xiao, Weihua, et al.
Published: (2024)
Rethinking Multi-Agent Intelligence Through the Lens of Small-World Networks
by: Wang, Boxuan, et al.
Published: (2025)
TAIJI: Textual Anchoring for Immunizing Jailbreak Images in Vision Language Models
by: Yin, Xiangyu, et al.
Published: (2025)
Building Guardrails for Large Language Models
by: Dong, Yi, et al.
Published: (2024)
RouteGuard: Internal-Signal Detection of Skill Poisoning in LLM Agents
by: Xiao, Wenjie, et al.
Published: (2026)
From Prefix Cache to Fusion RAG Cache: Accelerating LLM Inference in Retrieval-Augmented Generation
by: Wang, Jiahao, et al.
Published: (2026)
PrefixMemory-Tuning: Modernizing Prefix-Tuning by Decoupling the Prefix from Attention
by: Wang, Haonan, et al.
Published: (2025)
TraceGuard: Structured Multi-Dimensional Monitoring as a Collusion-Resistant Control Protocol
by: Nguyen, Khanh Linh, et al.
Published: (2026)
Where LLM Agents Fail and How They can Learn From Failures
by: Zhu, Kunlun, et al.
Published: (2025)
Requesting Expert Reasoning: Augmenting LLM Agents with Learned Collaborative Intervention
by: Wang, Zhiming, et al.
Published: (2026)
Counterfactual Trace Auditing of LLM Agent Skills
by: Zhou, Xiaolin, et al.
Published: (2026)
RoS-Guard: Robust and Scalable Online Change Detection with Delay-Optimal Guarantees
by: Zhu, Zelin, et al.
Published: (2025)
Tapas Are Free! Training-Free Adaptation of Programmatic Agents via LLM-Guided Program Synthesis in Dynamic Environments
by: Hu, Jinwei, et al.
Published: (2025)
Reachability Verification Based Reliability Assessment for Deep Reinforcement Learning Controlled Robotics and Autonomous Systems
by: Dong, Yi, et al.
Published: (2022)
Safe Pruning LoRA: Robust Distance-Guided Pruning for Safety Alignment in Adaptation of LLMs
by: Ao, Shuang, et al.
Published: (2025)
SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model
by: Huang, Zhenglin, et al.
Published: (2024)
BenchTrace: A Benchmark for Testing Reflection Ability and Controlled Evolution in LLM Agents
by: Huang, Jiahao, et al.
Published: (2026)
What is Formal Verification without Specifications? A Survey on mining LTL Specifications
by: Neider, Daniel, et al.
Published: (2025)
A-MemGuard: A Proactive Defense Framework for LLM-Based Agent Memory
by: Wei, Qianshan, et al.
Published: (2025)