:: Library Catalog

Bibliographic Details
Main Authors:	Pan, Bo; Kan, Xuan; Zhang, Kaitai; Yan, Yan; Tan, Shunwen; He, Zihao; Ding, Zixin; Wu, Junjie; Zhao, Liang
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.11340

Similar Items

Multi-Task Reinforcement Learning for Enhanced Multimodal LLM-as-a-Judge
by: Wu, Junjie, et al.
Published: (2026)

RubricEval: A Rubric-Level Meta-Evaluation Benchmark for LLM Judges in Instruction Following
by: Pan, Tianjun, et al.
Published: (2026)

Optimization-based Prompt Injection Attack to LLM-as-a-Judge
by: Shi, Jiawen, et al.
Published: (2024)

Auto-Prompt Ensemble for LLM Judge
by: Li, Jiajie, et al.
Published: (2025)

Can Past Experience Accelerate LLM Reasoning?
by: Pan, Bo, et al.
Published: (2025)

Bi-Level Optimization for Single Domain Generalization
by: Heidari, Marzi, et al.
Published: (2026)

Informed Routing in LLMs: Smarter Token-Level Computation for Faster Inference
by: Han, Chao, et al.
Published: (2025)

Rubrics as an Attack Surface: Stealthy Preference Drift in LLM Judges
by: Ding, Ruomeng, et al.
Published: (2026)

MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark
by: Chen, Dongping, et al.
Published: (2024)

Who's Your Judge? On the Detectability of LLM-Generated Judgments
by: Li, Dawei, et al.
Published: (2025)

Judging the Judges: A Systematic Study of Position Bias in LLM-as-a-Judge
by: Shi, Lin, et al.
Published: (2024)

A More Advanced Group Polarization Measurement Approach Based on LLM-Based Agents and Graphs
by: Liu, Zixin, et al.
Published: (2024)

Ada-R1: Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization
by: Luo, Haotian, et al.
Published: (2025)

Instructional Prompt Optimization for Few-Shot LLM-Based Recommendations on Cold-Start Users
by: Yang, Haowei, et al.
Published: (2025)

LLM and GNN are Complementary: Distilling LLM for Multimodal Graph Learning
by: Xu, Junjie, et al.
Published: (2024)

Judging with Many Minds: Do More Perspectives Mean Less Prejudice? On Bias Amplifications and Resistance in Multi-Agent Based LLM-as-Judge
by: Ma, Chiyu, et al.
Published: (2025)

Align to Misalign: Automatic LLM Jailbreak with Meta-Optimized LLM Judges
by: Koo, Hamin, et al.
Published: (2025)

Explainable and Fine-Grained Safeguarding of LLM Multi-Agent Systems via Bi-Level Graph Anomaly Detection
by: Pan, Junjun, et al.
Published: (2025)

JudgeFlow: Agentic Workflow Optimization via Block Judge
by: Ma, Zihan, et al.
Published: (2026)

Unifying Temporal and Structural Credit Assignment in LLM-Based Multi-Agent Prompt Optimization
by: Li, Wenwu, et al.
Published: (2026)

Bi-Level Policy Optimization with Nyström Hypergradients
by: Prakash, Arjun, et al.
Published: (2025)

Overconfidence in LLM-as-a-Judge: Diagnosis and Confidence-Driven Solution
by: Tian, Zailong, et al.
Published: (2025)

Multi-Agent Debate for LLM Judges with Adaptive Stability Detection
by: Hu, Tianyu, et al.
Published: (2025)

ONRW: Optimizing inversion noise for high-quality and robust watermark
by: Ding, Xuan, et al.
Published: (2026)

CodeJudgeBench: Benchmarking LLM-as-a-Judge for Coding Tasks
by: Jiang, Hongchao, et al.
Published: (2025)

Game-TARS: Pretrained Foundation Models for Scalable Generalist Multimodal Game Agents
by: Wang, Zihao, et al.
Published: (2025)

BiVRec: Bidirectional View-based Multimodal Sequential Recommendation
by: Hu, Jiaxi, et al.
Published: (2024)

Character-Level Perturbations Disrupt LLM Watermarks
by: Zhang, Zhaoxi, et al.
Published: (2025)

BadJudge: Backdoor Vulnerabilities of LLM-as-a-Judge
by: Tong, Terry, et al.
Published: (2025)

JudgeSQL: Reasoning over SQL Candidates with Weighted Consensus Tournament
by: Bai, Jiayuan, et al.
Published: (2025)

EasyGen: Easing Multimodal Generation with BiDiffuser and LLMs
by: Zhao, Xiangyu, et al.
Published: (2023)

MicroWorld: Empowering Multimodal Large Language Models to Bridge the Microscopic Domain Gap with Multimodal Attribute Graph
by: Li, Manyu, et al.
Published: (2026)

LLM-Assisted Op-Amp Behavioral-Level Design via Agentic Human-Mimicking Reasoning
by: Chen, Zihao, et al.
Published: (2026)

JudgeBench: A Benchmark for Evaluating LLM-based Judges
by: Tan, Sijun, et al.
Published: (2024)

BadPromptFL: A Novel Backdoor Threat to Prompt-based Federated Learning in Multimodal Models
by: Zhang, Maozhen, et al.
Published: (2025)

Criterion Validity of LLM-as-Judge for Business Outcomes in Conversational Commerce
by: Chen, Liang, et al.
Published: (2026)

Diagnosing Live Within-Policy Instruction Conflicts in LLM Agents with Witnessed Resolution Profiles
by: Yan, Lu, et al.
Published: (2026)

EASE: Federated Multimodal Unlearning via Entanglement-Aware Anchor Closure
by: Ding, Zihao, et al.
Published: (2026)

An Adaptive Differentially Private Federated Learning Framework with Bi-level Optimization
by: Wang, Jin, et al.
Published: (2026)

Incentivizing Agentic Reasoning in LLM Judges via Tool-Integrated Reinforcement Learning
by: Xu, Ran, et al.
Published: (2025)