| Main Authors: | Jiang, Fengqing, Li, Yuetai, Feng, Yichen, Zheng, Kaiyuan, Niu, Luyao, Ramasubramanian, Bhaskar, Alomair, Basel, Bushnell, Linda, Poovendran, Radha |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.13690 |
Similar Items
Polyhedral Instability Governs Regret in Online Learning
by: Li, Yuetai, et al.
Published: (2026)
BadScientist: Can a Research Agent Write Convincing but Unsound Papers that Fool LLM Reviewers?
by: Jiang, Fengqing, et al.
Published: (2025)
CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models
by: Li, Yuetai, et al.
Published: (2024)
VisualSphinx: Large-Scale Synthetic Vision Logic Puzzles for RL
by: Feng, Yichen, et al.
Published: (2025)
TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning
by: Xu, Zhangchen, et al.
Published: (2025)
Temporal Sampling for Forgotten Reasoning in LLMs
by: Li, Yuetai, et al.
Published: (2025)
Small Models Struggle to Learn from Strong Reasoners
by: Li, Yuetai, et al.
Published: (2025)
Visual Aesthetic Benchmark: Can Frontier Models Judge Beauty?
by: Feng, Yichen, et al.
Published: (2026)
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs
by: Jiang, Fengqing, et al.
Published: (2024)
Distributed Safety-Critical Control of Multi-Agent Systems with Time-Varying Communication Topologies
by: Cheng, Shiyu, et al.
Published: (2026)
Modeling and Designing Non-Pharmaceutical Interventions in Epidemics: A Submodular Approach
by: Cheng, Shiyu, et al.
Published: (2024)
Swarm-STL: A Framework for Motion Planning in Large-Scale, Multi-Swarm Systems
by: Cheng, Shiyu, et al.
Published: (2025)
SoSBench: Benchmarking Safety Alignment on Six Scientific Domains
by: Jiang, Fengqing, et al.
Published: (2025)
A Method for Fast Autonomy Transfer in Reinforcement Learning
by: Sahabandu, Dinuka, et al.
Published: (2024)
BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models
by: Xiang, Zhen, et al.
Published: (2024)
Who is Responsible? Explaining Safety Violations in Multi-Agent Cyber-Physical Systems
by: Niu, Luyao, et al.
Published: (2024)
Brave: Byzantine-Resilient and Privacy-Preserving Peer-to-Peer Federated Learning
by: Xu, Zhangchen, et al.
Published: (2024)
SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities
by: Jiang, Fengqing, et al.
Published: (2025)
JobBench: Aligning Agent Work With Human Will
by: Li, Yuetai, et al.
Published: (2026)
Stronger Models are NOT Stronger Teachers for Instruction Tuning
by: Xu, Zhangchen, et al.
Published: (2024)
ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates
by: Jiang, Fengqing, et al.
Published: (2024)
Game of Trojans: Adaptive Adversaries Against Output-based Trojaned-Model Detectors
by: Sahabandu, Dinuka, et al.
Published: (2024)
ACE: A Model Poisoning Attack on Contribution Evaluation Methods in Federated Learning
by: Xu, Zhangchen, et al.
Published: (2024)
SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
by: Xu, Zhangchen, et al.
Published: (2024)
Fault Tolerant Neural Control Barrier Functions for Robotic Systems under Sensor Faults and Attacks
by: Zhang, Hongchao, et al.
Published: (2024)
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
by: Xu, Zhangchen, et al.
Published: (2024)
CANTXSec: A Deterministic Intrusion Detection and Prevention System for CAN Bus Monitoring ECU Activations
by: Donadel, Denis, et al.
Published: (2025)
Double-Dip: Thwarting Label-Only Membership Inference Attacks with Transfer Learning and Randomization
by: Rajabi, Arezoo, et al.
Published: (2024)
A Compositional Resilience Index for Computationally Efficient Safety Analysis of Interconnected Systems
by: Niu, Luyao, et al.
Published: (2023)
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
by: Huan, Maggie, et al.
Published: (2025)
The Widths of Strict Outerconfluent Graphs
by: Eppstein, David
Published: (2023)
Simulating Environments with Reasoning Models for Agent Training
by: Li, Yuetai, et al.
Published: (2025)
Preventing Prompt Injection with Type-Directed Privilege Separation
by: Jacob, Dennis, et al.
Published: (2025)
Defending Against Prompt Injection with DataFilter
by: Wang, Yizhu, et al.
Published: (2025)
PromptShield: Deployable Detection for Prompt Injection Attacks
by: Jacob, Dennis, et al.
Published: (2025)
Width Hierarchy for k-OBDD of Small Width
by: Khadiev, Kamil
Published: (2015)
StyleRF-VolVis: Style Transfer of Neural Radiance Fields for Expressive Volume Visualization
by: Tang, Kaiyuan, et al.
Published: (2024)
Bisection Width, Discrepancy, and Eigenvalues of Hypergraphs
by: Räty, Eero, et al.
Published: (2024)
KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding
by: Xu, Zhangchen, et al.
Published: (2025)
The Role of Depth, Width, and Tree Size in Expressiveness of Deep Forest
by: Lyu, Shen-Huan, et al.
Published: (2024)