| Main Authors: | Pan, Bo, Kan, Xuan, Zhang, Kaitai, Yan, Yan, Tan, Shunwen, He, Zihao, Ding, Zixin, Wu, Junjie, Zhao, Liang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.11340 |
Similar Items
Multi-Task Reinforcement Learning for Enhanced Multimodal LLM-as-a-Judge
by: Wu, Junjie, et al.
Published: (2026)
RubricEval: A Rubric-Level Meta-Evaluation Benchmark for LLM Judges in Instruction Following
by: Pan, Tianjun, et al.
Published: (2026)
Optimization-based Prompt Injection Attack to LLM-as-a-Judge
by: Shi, Jiawen, et al.
Published: (2024)
Auto-Prompt Ensemble for LLM Judge
by: Li, Jiajie, et al.
Published: (2025)
Can Past Experience Accelerate LLM Reasoning?
by: Pan, Bo, et al.
Published: (2025)
Bi-Level Optimization for Single Domain Generalization
by: Heidari, Marzi, et al.
Published: (2026)
Informed Routing in LLMs: Smarter Token-Level Computation for Faster Inference
by: Han, Chao, et al.
Published: (2025)
Rubrics as an Attack Surface: Stealthy Preference Drift in LLM Judges
by: Ding, Ruomeng, et al.
Published: (2026)
MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark
by: Chen, Dongping, et al.
Published: (2024)
Who's Your Judge? On the Detectability of LLM-Generated Judgments
by: Li, Dawei, et al.
Published: (2025)
Judging the Judges: A Systematic Study of Position Bias in LLM-as-a-Judge
by: Shi, Lin, et al.
Published: (2024)
A More Advanced Group Polarization Measurement Approach Based on LLM-Based Agents and Graphs
by: Liu, Zixin, et al.
Published: (2024)
Ada-R1: Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization
by: Luo, Haotian, et al.
Published: (2025)
Instructional Prompt Optimization for Few-Shot LLM-Based Recommendations on Cold-Start Users
by: Yang, Haowei, et al.
Published: (2025)
LLM and GNN are Complementary: Distilling LLM for Multimodal Graph Learning
by: Xu, Junjie, et al.
Published: (2024)
Judging with Many Minds: Do More Perspectives Mean Less Prejudice? On Bias Amplifications and Resistance in Multi-Agent Based LLM-as-Judge
by: Ma, Chiyu, et al.
Published: (2025)
Align to Misalign: Automatic LLM Jailbreak with Meta-Optimized LLM Judges
by: Koo, Hamin, et al.
Published: (2025)
Explainable and Fine-Grained Safeguarding of LLM Multi-Agent Systems via Bi-Level Graph Anomaly Detection
by: Pan, Junjun, et al.
Published: (2025)
JudgeFlow: Agentic Workflow Optimization via Block Judge
by: Ma, Zihan, et al.
Published: (2026)
Unifying Temporal and Structural Credit Assignment in LLM-Based Multi-Agent Prompt Optimization
by: Li, Wenwu, et al.
Published: (2026)
Bi-Level Policy Optimization with Nyström Hypergradients
by: Prakash, Arjun, et al.
Published: (2025)
Overconfidence in LLM-as-a-Judge: Diagnosis and Confidence-Driven Solution
by: Tian, Zailong, et al.
Published: (2025)
Multi-Agent Debate for LLM Judges with Adaptive Stability Detection
by: Hu, Tianyu, et al.
Published: (2025)
ONRW: Optimizing inversion noise for high-quality and robust watermark
by: Ding, Xuan, et al.
Published: (2026)
CodeJudgeBench: Benchmarking LLM-as-a-Judge for Coding Tasks
by: Jiang, Hongchao, et al.
Published: (2025)
Game-TARS: Pretrained Foundation Models for Scalable Generalist Multimodal Game Agents
by: Wang, Zihao, et al.
Published: (2025)
BiVRec: Bidirectional View-based Multimodal Sequential Recommendation
by: Hu, Jiaxi, et al.
Published: (2024)
Character-Level Perturbations Disrupt LLM Watermarks
by: Zhang, Zhaoxi, et al.
Published: (2025)
BadJudge: Backdoor Vulnerabilities of LLM-as-a-Judge
by: Tong, Terry, et al.
Published: (2025)
JudgeSQL: Reasoning over SQL Candidates with Weighted Consensus Tournament
by: Bai, Jiayuan, et al.
Published: (2025)
EasyGen: Easing Multimodal Generation with BiDiffuser and LLMs
by: Zhao, Xiangyu, et al.
Published: (2023)
MicroWorld: Empowering Multimodal Large Language Models to Bridge the Microscopic Domain Gap with Multimodal Attribute Graph
by: Li, Manyu, et al.
Published: (2026)
LLM-Assisted Op-Amp Behavioral-Level Design via Agentic Human-Mimicking Reasoning
by: Chen, Zihao, et al.
Published: (2026)
JudgeBench: A Benchmark for Evaluating LLM-based Judges
by: Tan, Sijun, et al.
Published: (2024)
BadPromptFL: A Novel Backdoor Threat to Prompt-based Federated Learning in Multimodal Models
by: Zhang, Maozhen, et al.
Published: (2025)
Criterion Validity of LLM-as-Judge for Business Outcomes in Conversational Commerce
by: Chen, Liang, et al.
Published: (2026)
Diagnosing Live Within-Policy Instruction Conflicts in LLM Agents with Witnessed Resolution Profiles
by: Yan, Lu, et al.
Published: (2026)
EASE: Federated Multimodal Unlearning via Entanglement-Aware Anchor Closure
by: Ding, Zihao, et al.
Published: (2026)
An Adaptive Differentially Private Federated Learning Framework with Bi-level Optimization
by: Wang, Jin, et al.
Published: (2026)
Incentivizing Agentic Reasoning in LLM Judges via Tool-Integrated Reinforcement Learning
by: Xu, Ran, et al.
Published: (2025)