:: Library Catalog

Bibliographic Details
Main Authors:	Huang, ShiYing; Lin, Liang; Li, Yuer; Luo, Kaiwen; Zhou, Zhenhong; Zhang, An; Dong, Junhao; Wang, Kun; Zeng, Zhigang
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.11679

Similar Items

CSSBench: Evaluating the Safety of Lightweight LLMs against Chinese-Specific Adversarial Patterns
by: Zhou, Zhenhong, et al.
Published: (2026)

EchoDistill: Alignment Noisy-to-Clean Self-Distillation for Robust Audio LLMs
by: Lin, Liang, et al.
Published: (2026)

How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
by: Zhou, Zhenhong, et al.
Published: (2024)

MAD-OPD: Breaking the Ceiling in On-Policy Distillation via Multi-Agent Debate
by: Wang, Jianze, et al.
Published: (2026)

Breaking the Ceiling: Exploring the Potential of Jailbreak Attacks through Expanding Strategy Space
by: Huang, Yao, et al.
Published: (2025)

Omni-Safety under Cross-Modality Conflict: Vulnerabilities, Dynamics Mechanisms and Efficient Alignment
by: Wang, Kun, et al.
Published: (2026)

HelpSteer2-Preference: Complementing Ratings with Preferences
by: Wang, Zhilin, et al.
Published: (2024)

Interior Eigensolver Based on Rational Filter with Composite rule
by: Chen, Yuer, et al.
Published: (2023)

Course-Correction: Safety Alignment Using Synthetic Preferences
by: Xu, Rongwu, et al.
Published: (2024)

CeRA: Overcoming the Linear Ceiling of Low-Rank Adaptation via Capacity Expansion
by: Chen, Hung-Hsuan
Published: (2026)

Breaking the Compression Ceiling: Data-Free Pipeline for Ultra-Efficient Delta Compression
by: Wang, Xiaohui, et al.
Published: (2025)

Breaking the Performance Ceiling in Reinforcement Learning requires Inference Strategies
by: Chalumeau, Felix, et al.
Published: (2025)

Hit-RAG: Learning to Reason with Long Contexts via Preference Alignment
by: Liu, Junming, et al.
Published: (2026)

ChronosAudio: A Comprehensive Long-Audio Benchmark for Evaluating Audio-Large Language Models
by: Luo, Kaiwen, et al.
Published: (2026)

RSA-Bench: Benchmarking Audio Large Models in Real-World Acoustic Scenarios
by: Zhang, Yibo, et al.
Published: (2026)

HearSay Benchmark: Do Audio LLMs Leak What They Hear?
by: Wang, Jin, et al.
Published: (2026)

Adaptive Helpfulness-Harmlessness Alignment with Preference Vectors
by: Liang, Ren-Wei, et al.
Published: (2025)

Breaking the Capability Ceiling of LLM Post-Training by Reintroducing Markov States
by: Yuan, Yurun, et al.
Published: (2026)

Pcc-tuning: Breaking the Contrastive Learning Ceiling in Semantic Textual Similarity
by: Zhang, Bowen, et al.
Published: (2024)

Improving 3D Finger Traits Recognition via Generalizable Neural Rendering
by: Xu, Hongbin, et al.
Published: (2024)

Can LLMs Help Decentralized Dispute Arbitration? A Case Study of UMA-Resolved Markets on Polymarket
by: Wen, Junhao, et al.
Published: (2026)

Hidden in the Noise: Unveiling Backdoors in Audio LLMs Alignment through Latent Acoustic Pattern Triggers
by: Lin, Liang, et al.
Published: (2025)

Breaking the Ceiling of the LLM Community by Treating Token Generation as a Classification for Ensembling
by: Yu, Yao-Ching, et al.
Published: (2024)

On the Role of Attention Heads in Large Language Model Safety
by: Zhou, Zhenhong, et al.
Published: (2024)

Attribute-Grounded Selective Reasoning for Artwork Emotion Understanding with Multimodal Large Language Models
by: Zhang, Cheng, et al.
Published: (2026)

Backdoor Collapse: Eliminating Unknown Threats via Known Backdoor Aggregation in Language Models
by: Lin, Liang, et al.
Published: (2025)

Ceiling of Barium Substitution for B‐Site Cation in Organometal Halide Perovskite Solar Cells
by: Kai-Chi Hsiao, et al.
Published: (2024)

Estimation of Riemannian Quantities from Noisy Data via Density Derivatives
by: Chen, Junhao, et al.
Published: (2026)

Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models
by: Zhang, Wenxuan, et al.
Published: (2024)

MetaAgent-X: Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning
by: Zhang, Yaolun, et al.
Published: (2026)

Tavan / Ceiling
by: petaaerial
Published: (2020)

A Geometric Probe of the Accuracy-Robustness Trade-off: Sharp Boundaries in Symmetry-Breaking Dimensional Expansion
by: Bai, Yu, et al.
Published: (2026)

Explaining Human Preferences via Metrics for Structured 3D Reconstruction
by: Langerman, Jack, et al.
Published: (2025)

Jailbreaking Large Language Diffusion Models: Revealing Hidden Safety Flaws in Diffusion-Based Text Generation
by: Zhang, Yuanhe, et al.
Published: (2025)

HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages
by: Wang, Zhilin, et al.
Published: (2025)

Metis-SPECS: Decoupling Multimodal Learning via Self-distilled Preference-based Cold Start
by: Chen, Kun, et al.
Published: (2025)

Balancing Safety and Helpfulness in Healthcare AI Assistants through Iterative Preference Alignment
by: Nghiem, Huy, et al.
Published: (2025)

Does Using Counterfactual Help LLMs Explain Textual Importance in Classification?
by: Tan, Nelvin, et al.
Published: (2025)

ImageVeriBypasser: An image verification code recognition approach based on Convolutional Neural Network
by: Tong Ji, et al.
Published: (2024)

Attention Masks Help Adversarial Attacks to Bypass Safety Detectors
by: Shi, Yunfan
Published: (2024)