| Main Authors: | Bao, Han, Huang, Yue, Wang, Xiaoda, Zhang, Zheyuan, Zhou, Yujun, Yang, Carl, Zhang, Xiangliang, Ye, Yanfang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2602.20042 |
Similar Items
PolicyLLM: Towards Excellent Comprehension of Public Policy for Large Language Models
by: Bao, Han, et al.
Published: (2026)
SPA: Achieving Consensus in LLM Alignment via Self-Priority Optimization
by: Huang, Yue, et al.
Published: (2025)
Why Semantic Entropy Fails: Geometry-Aware and Calibrated Uncertainty for Policy Optimization
by: Zhang, Zheyuan, et al.
Published: (2026)
Social Science Meets LLMs: How Reliable Are Large Language Models in Social Simulations?
by: Huang, Yue, et al.
Published: (2024)
EvolveRouter: Co-Evolving Routing and Prompt for Multi-Agent Question Answering
by: Huang, Jiatan, et al.
Published: (2026)
Capability-Oriented Training Induced Alignment Risk
by: Zhou, Yujun, et al.
Published: (2026)
Drift-Bench: Diagnosing Cooperative Breakdowns in LLM Agents under Input Faults via Multi-Turn Interaction
by: Bao, Han, et al.
Published: (2026)
Causally-Enhanced Reinforcement Policy Optimization
by: Wang, Xiangqi, et al.
Published: (2025)
GLEN-Bench: A Graph-Language based Benchmark for Nutritional Health
by: Huang, Jiatan, et al.
Published: (2026)
MAPRO: Recasting Multi-Agent Prompt Optimization as Maximum a Posteriori Inference
by: Zhang, Zheyuan, et al.
Published: (2025)
Exploring Multi-Temperature Strategies for Token- and Rollout-Level Control in RLVR
by: Zhuang, Haomin, et al.
Published: (2025)
Can LLMs Convert Graphs to Text-Attributed Graphs?
by: Wang, Zehong, et al.
Published: (2024)
Dissecting Logical Reasoning in LLMs: A Fine-Grained Evaluation and Supervision Study
by: Zhou, Yujun, et al.
Published: (2025)
NG-Router: Graph-Supervised Multi-Agent Collaboration for Nutrition Question Answering
by: Shi, Kaiwen, et al.
Published: (2025)
CiteAudit: You Cited It, But Did You Read It? A Benchmark for Verifying Scientific References in the LLM Era
by: Shi, Kaiwen, et al.
Published: (2026)
Better Datasets Start From RefineLab: Automatic Optimization for High-Quality Dataset Refinement
by: Luo, Xiaonan, et al.
Published: (2025)
Guardian-as-an-Advisor: Advancing Next-Generation Guardian Models for Trustworthy LLMs
by: Huang, Yue, et al.
Published: (2026)
Food4All: A Multi-Agent Framework for Real-time Free Food Discovery with Integrated Nutritional Metadata
by: Yuan, Zhengqing, et al.
Published: (2025)
Jailbreaking Large Language Models Through Alignment Vulnerabilities in Out-of-Distribution Settings
by: Huang, Yue, et al.
Published: (2024)
AgentRouter: A Knowledge-Graph-Guided LLM Router for Collaborative Multi-Agent Question Answering
by: Zhang, Zheyuan, et al.
Published: (2025)
My Favorite Streamer is an LLM: Discovering, Bonding, and Co-Creating in AI VTuber Fandom
by: Ye, Jiayi, et al.
Published: (2025)
Dual Optimal: Make Your LLM Peer-like with Dignity
by: Wang, Xiangqi, et al.
Published: (2026)
Evaluating and Mitigating Bias in AI-Based Medical Text Generation
by: Chen, Xiuying, et al.
Published: (2025)
SocialMaze: A Benchmark for Evaluating Social Reasoning in Large Language Models
by: Xu, Zixiang, et al.
Published: (2025)
When AI Settles Down: Late-Stage Stability as a Signature of AI-Generated Text Detection
by: Sun, Ke, et al.
Published: (2026)
ProbeLLM: Automating Principled Diagnosis of LLM Failures
by: Huang, Yue, et al.
Published: (2026)
A Combinatorial Approach to Neural Emergent Communication
by: Zhang, Zheyuan
Published: (2024)
1+1>2: Can Large Language Models Serve as Cross-Lingual Knowledge Aggregators?
by: Huang, Yue, et al.
Published: (2024)
SceMQA: A Scientific College Entrance Level Multimodal Question Answering Benchmark
by: Liang, Zhenwen, et al.
Published: (2024)
Cross-Lingual Pitfalls: Automatic Probing Cross-Lingual Weakness of Multilingual Large Language Models
by: Xu, Zixiang, et al.
Published: (2025)
Evaluating Large Language Models with Psychometrics
by: Li, Yuan, et al.
Published: (2024)
Adaptive Distraction: Probing LLM Contextual Robustness with Automated Tree Search
by: Wang, Yanbo, et al.
Published: (2025)
ChemOrch: Empowering LLMs with Chemical Intelligence via Synthetic Instructions
by: Huang, Yue, et al.
Published: (2025)
EfficientLLM: Efficiency in Large Language Models
by: Yuan, Zhengqing, et al.
Published: (2025)
Enhance Graph Alignment for Large Language Models
by: Luo, Haitong, et al.
Published: (2024)
EvoTaxo: Building and Evolving Taxonomy from Social Media Streams
by: Li, Yiyang, et al.
Published: (2026)
Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation
by: Zhou, Yujun, et al.
Published: (2025)
Decoupling Content and Expression: Two-Dimensional Detection of AI-Generated Text
by: Bao, Guangsheng, et al.
Published: (2025)
Personality Alignment of Large Language Models
by: Zhu, Minjun, et al.
Published: (2024)
NARRA-Gym for Evaluating Interactive Narrative Agents
by: Huang, Yue, et al.
Published: (2026)