:: Library Catalog

Bibliographic Details
Main Authors:	Jiang, Fengqing; Li, Yuetai; Feng, Yichen; Zheng, Kaiyuan; Niu, Luyao; Ramasubramanian, Bhaskar; Alomair, Basel; Bushnell, Linda; Poovendran, Radha
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.13690

Similar Items

Polyhedral Instability Governs Regret in Online Learning
by: Li, Yuetai, et al.
Published: (2026)

BadScientist: Can a Research Agent Write Convincing but Unsound Papers that Fool LLM Reviewers?
by: Jiang, Fengqing, et al.
Published: (2025)

CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models
by: Li, Yuetai, et al.
Published: (2024)

VisualSphinx: Large-Scale Synthetic Vision Logic Puzzles for RL
by: Feng, Yichen, et al.
Published: (2025)

TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning
by: Xu, Zhangchen, et al.
Published: (2025)

Temporal Sampling for Forgotten Reasoning in LLMs
by: Li, Yuetai, et al.
Published: (2025)

Small Models Struggle to Learn from Strong Reasoners
by: Li, Yuetai, et al.
Published: (2025)

Visual Aesthetic Benchmark: Can Frontier Models Judge Beauty?
by: Feng, Yichen, et al.
Published: (2026)

ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs
by: Jiang, Fengqing, et al.
Published: (2024)

Distributed Safety-Critical Control of Multi-Agent Systems with Time-Varying Communication Topologies
by: Cheng, Shiyu, et al.
Published: (2026)

Modeling and Designing Non-Pharmaceutical Interventions in Epidemics: A Submodular Approach
by: Cheng, Shiyu, et al.
Published: (2024)

Swarm-STL: A Framework for Motion Planning in Large-Scale, Multi-Swarm Systems
by: Cheng, Shiyu, et al.
Published: (2025)

SoSBench: Benchmarking Safety Alignment on Six Scientific Domains
by: Jiang, Fengqing, et al.
Published: (2025)

A Method for Fast Autonomy Transfer in Reinforcement Learning
by: Sahabandu, Dinuka, et al.
Published: (2024)

BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models
by: Xiang, Zhen, et al.
Published: (2024)

Who is Responsible? Explaining Safety Violations in Multi-Agent Cyber-Physical Systems
by: Niu, Luyao, et al.
Published: (2024)

Brave: Byzantine-Resilient and Privacy-Preserving Peer-to-Peer Federated Learning
by: Xu, Zhangchen, et al.
Published: (2024)

SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities
by: Jiang, Fengqing, et al.
Published: (2025)

JobBench: Aligning Agent Work With Human Will
by: Li, Yuetai, et al.
Published: (2026)

Stronger Models are NOT Stronger Teachers for Instruction Tuning
by: Xu, Zhangchen, et al.
Published: (2024)

ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates
by: Jiang, Fengqing, et al.
Published: (2024)

Game of Trojans: Adaptive Adversaries Against Output-based Trojaned-Model Detectors
by: Sahabandu, Dinuka, et al.
Published: (2024)

ACE: A Model Poisoning Attack on Contribution Evaluation Methods in Federated Learning
by: Xu, Zhangchen, et al.
Published: (2024)

SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
by: Xu, Zhangchen, et al.
Published: (2024)

Fault Tolerant Neural Control Barrier Functions for Robotic Systems under Sensor Faults and Attacks
by: Zhang, Hongchao, et al.
Published: (2024)

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
by: Xu, Zhangchen, et al.
Published: (2024)

CANTXSec: A Deterministic Intrusion Detection and Prevention System for CAN Bus Monitoring ECU Activations
by: Donadel, Denis, et al.
Published: (2025)

Double-Dip: Thwarting Label-Only Membership Inference Attacks with Transfer Learning and Randomization
by: Rajabi, Arezoo, et al.
Published: (2024)

A Compositional Resilience Index for Computationally Efficient Safety Analysis of Interconnected Systems
by: Niu, Luyao, et al.
Published: (2023)

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
by: Huan, Maggie, et al.
Published: (2025)

The Widths of Strict Outerconfluent Graphs
by: Eppstein, David
Published: (2023)

Simulating Environments with Reasoning Models for Agent Training
by: Li, Yuetai, et al.
Published: (2025)

Preventing Prompt Injection with Type-Directed Privilege Separation
by: Jacob, Dennis, et al.
Published: (2025)

Defending Against Prompt Injection with DataFilter
by: Wang, Yizhu, et al.
Published: (2025)

PromptShield: Deployable Detection for Prompt Injection Attacks
by: Jacob, Dennis, et al.
Published: (2025)

Width Hierarchy for k-OBDD of Small Width
by: Khadiev, Kamil
Published: (2015)

StyleRF-VolVis: Style Transfer of Neural Radiance Fields for Expressive Volume Visualization
by: Tang, Kaiyuan, et al.
Published: (2024)

Bisection Width, Discrepancy, and Eigenvalues of Hypergraphs
by: Räty, Eero, et al.
Published: (2024)

KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding
by: Xu, Zhangchen, et al.
Published: (2025)

The Role of Depth, Width, and Tree Size in Expressiveness of Deep Forest
by: Lyu, Shen-Huan, et al.
Published: (2024)