| Main Authors: | Chen, Zhi-Yuan, Wang, Hao, Zhang, Xinyu, Hu, Enrui, Lin, Yankai |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.02592 |
Similar Items
Beyond Single-Point Judgment: Distribution Alignment for LLM-as-a-Judge
by: Chen, Luyu, et al.
Published: (2025)
Advancing LLM Reasoning Generalists with Preference Trees
by: Yuan, Lifan, et al.
Published: (2024)
Rational Decision-Making Agent with Internalized Utility Judgment
by: Ye, Yining, et al.
Published: (2023)
Spread Preference Annotation: Direct Preference Judgment for Efficient LLM Alignment
by: Kim, Dongyoung, et al.
Published: (2024)
Do LLM Evaluators Prefer Themselves for a Reason?
by: Chen, Wei-Lin, et al.
Published: (2025)
Logical Consistency as a Bridge: Improving LLM Hallucination Detection via Label Constraint Modeling between Responses and Self-Judgments
by: Mi, Hao, et al.
Published: (2026)
Permutative Preference Alignment from Listwise Ranking of Human Judgments
by: Zhao, Yang, et al.
Published: (2024)
Improving LLM-as-a-Judge Inference with the Judgment Distribution
by: Wang, Victor, et al.
Published: (2025)
DARC: Decoupled Asymmetric Reasoning Curriculum for LLM Evolution
by: Fan, Shengda, et al.
Published: (2026)
Beyond Guilt: Legal Judgment Prediction with Trichotomous Reasoning
by: Zhang, Kepu, et al.
Published: (2024)
WEPO: Web Element Preference Optimization for LLM-based Web Navigation
by: Liu, Jiarun, et al.
Published: (2024)
Swarm Skills: A Portable, Self-Evolving Multi-Agent System Specification for Coordination Engineering
by: Zhang, Xinyu, et al.
Published: (2026)
GLARE: Agentic Reasoning for Legal Judgment Prediction
by: Yang, Xinyu, et al.
Published: (2025)
From Blind Guess to Informed Judgment: Teaching LLMs to Evaluate Materials by Building Knowledge-Augmented Preference Signals
by: Yu, Yeyong, et al.
Published: (2026)
Tracing How Annotators Think: Augmenting Preference Judgments with Reading Processes
by: de Langis, Karin, et al.
Published: (2025)
TSO: Self-Training with Scaled Preference Optimization
by: Chen, Kaihui, et al.
Published: (2024)
Mitigating Judgment Preference Bias in Large Language Models through Group-Based Polling
by: Liu, Shuliang, et al.
Published: (2025)
Large Language Model-based Human-Agent Collaboration for Complex Task Solving
by: Feng, Xueyang, et al.
Published: (2024)
Explaining Length Bias in LLM-Based Preference Evaluations
by: Hu, Zhengyu, et al.
Published: (2024)
Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning
by: Yang, Wenkai, et al.
Published: (2025)
Exploring Backdoor Vulnerabilities of Chat Models
by: Hao, Yunzhuo, et al.
Published: (2024)
Multimodal Multi-Agent Empowered Legal Judgment Prediction
by: Kang, Zhaolu, et al.
Published: (2026)
Self-Preference Bias in LLM-as-a-Judge
by: Wataoka, Koki, et al.
Published: (2024)
ImplicitRM: Unbiased Reward Modeling from Implicit Preference Data for LLM alignment
by: Wang, Hao, et al.
Published: (2026)
Analyzing and Evaluating Correlation Measures in NLG Meta-Evaluation
by: Gao, Mingqi, et al.
Published: (2024)
Quantifying and Mitigating Self-Preference Bias of LLM Judges
by: Yang, Jinming, et al.
Published: (2026)
AlpsBench: An LLM Personalization Benchmark for Real-Dialogue Memorization and Preference Alignment
by: Xiao, Jianfei, et al.
Published: (2026)
Self-Infilling Code Generation
by: Zheng, Lin, et al.
Published: (2023)
Can You Trust LLM Judgments? Reliability of LLM-as-a-Judge
by: Schroeder, Kayla, et al.
Published: (2024)
Self-supervised Preference Optimization: Enhance Your Language Model with Preference Degree Awareness
by: Li, Jian, et al.
Published: (2024)
Re-evaluating Automatic LLM System Ranking for Alignment with Human Preference
by: Gao, Mingqi, et al.
Published: (2024)
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards
by: Zhang, Kaiyi, et al.
Published: (2026)
Improving Alignment in LVLMs with Debiased Self-Judgment
by: Yang, Sihan, et al.
Published: (2025)
Judgment of Learning: A Human Ability Beyond Generative Artificial Intelligence
by: Huff, Markus, et al.
Published: (2024)
Benchmarking LLM-based Relevance Judgment Methods
by: Arabzadeh, Negar, et al.
Published: (2025)
DGPO: Beyond Pairwise Preferences with Directional Consistent Groupwise Optimization
by: Deng, Mengyi, et al.
Published: (2026)
Reliable Conversational Agents under ASP Control that Understand Natural Language
by: Zeng, Yankai
Published: (2025)
Towards Cross-lingual Values Judgment: A Consensus-Pluralism Perspective
by: Chen, Yukun, et al.
Published: (2026)
LocalSUG: City-Preference-Enhanced LLM for Query Suggestion in Local-Life Services
by: Chen, Jinwen, et al.
Published: (2026)
How Many Human Judgments Are Enough? Feasibility Limits of Human Preference Evaluation
by: Lee, Wilson Y.
Published: (2026)