| Main Authors: | Wang, Yikun, Zheng, Rui, Li, Haoming, Zhang, Qi, Gui, Tao, Liu, Fei |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Online Access: | https://arxiv.org/abs/2311.09136 |
Similar Items
Improving RL Exploration for LLM Reasoning through Retrospective Replay
by: Dou, Shihan, et al.
Published: (2025)
LLM-DA: Data Augmentation via Large Language Models for Few-Shot Named Entity Recognition
by: Ye, Junjie, et al.
Published: (2024)
Uncertainty Aware Learning for Language Model Alignment
by: Wang, Yikun, et al.
Published: (2024)
LLM-Detector: Improving AI-Generated Chinese Text Detection with Open-Source LLM Instruction Tuning
by: Wang, Rongsheng, et al.
Published: (2024)
Enabling Weak LLMs to Judge Response Reliability via Meta Ranking
by: Liu, Zijun, et al.
Published: (2024)
Personalized LLM Response Generation with Parameterized Memory Injection
by: Zhang, Kai, et al.
Published: (2024)
Systematic Analysis of LLM Contributions to Planning: Solver, Verifier, Heuristic
by: Li, Haoming, et al.
Published: (2024)
SafeAligner: Safety Alignment against Jailbreak Attacks via Response Disparity Guidance
by: Huang, Caishuang, et al.
Published: (2024)
RevisEval: Improving LLM-as-a-Judge via Response-Adapted References
by: Zhang, Qiyuan, et al.
Published: (2024)
Improving Similar Case Retrieval Ranking Performance By Revisiting RankSVM
by: Liu, Yuqi, et al.
Published: (2025)
Evaluating LLMs at Detecting Errors in LLM Responses
by: Kamoi, Ryo, et al.
Published: (2024)
TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness
by: Zheng, Danna, et al.
Published: (2024)
TOOL-ED: Enhancing Empathetic Response Generation with the Tool Calling Capability of LLM
by: Cao, Huiying, et al.
Published: (2024)
Creative Beam Search: LLM-as-a-Judge For Improving Response Generation
by: Franceschelli, Giorgio, et al.
Published: (2024)
Toward Optimal LLM Alignments Using Two-Player Games
by: Zheng, Rui, et al.
Published: (2024)
Conversational User-AI Intervention: A Study on Prompt Rewriting for Improved LLM Response Generation
by: Sarkar, Rupak, et al.
Published: (2025)
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback
by: Dou, Shihan, et al.
Published: (2024)
Modeling Layout Reading Order as Ordering Relations for Visually-rich Document Understanding
by: Zhang, Chong, et al.
Published: (2024)
Citations and Trust in LLM Generated Responses
by: Ding, Yifan, et al.
Published: (2025)
When to Trust LLMs: Aligning Confidence with Response Quality
by: Tao, Shuchang, et al.
Published: (2024)
LASP: Surveying the State-of-the-Art in Large Language Model-Assisted AI Planning
by: Li, Haoming, et al.
Published: (2024)
RMB: Comprehensively Benchmarking Reward Models in LLM Alignment
by: Zhou, Enyu, et al.
Published: (2024)
LADR: Locality-Aware Dynamic Rescue for Efficient Text-to-Image Generation with Diffusion Large Language Models
by: Wang, Chenglin, et al.
Published: (2026)
Ad Insertion in LLM-Generated Responses
by: Xu, Shengwei, et al.
Published: (2026)
Knowledge-tuning Large Language Models with Structured Medical Knowledge Bases for Reliable Response Generation in Chinese
by: Wang, Haochun, et al.
Published: (2023)
Length Generalization of Causal Transformers without Position Encoding
by: Wang, Jie, et al.
Published: (2024)
Domain Generalization via Causal Adjustment for Cross-Domain Sentiment Analysis
by: Wang, Siyin, et al.
Published: (2024)
Monotonic Paraphrasing Improves Generalization of Language Model Prompting
by: Liu, Qin, et al.
Published: (2024)
Confidence-Based Response Abstinence: Improving LLM Trustworthiness via Activation-Based Uncertainty Estimation
by: Huang, Zhiqi, et al.
Published: (2025)
Subspace Defense: Discarding Adversarial Perturbations by Learning a Subspace for Clean Signals
by: Zheng, Rui, et al.
Published: (2024)
FINEST: Improving LLM Responses to Sensitive Topics Through Fine-Grained Evaluation
by: Oh, Juhyun, et al.
Published: (2026)
Generate Logical Equivalence Questions
by: Wang, Xinyu, et al.
Published: (2025)
PLANET: A Collection of Benchmarks for Evaluating LLMs' Planning Capabilities
by: Li, Haoming, et al.
Published: (2025)
FRIDA to the Rescue! Analyzing Synthetic Data Effectiveness in Object-Based Common Sense Reasoning for Disaster Response
by: Shichman, Mollie, et al.
Published: (2025)
Probing then Editing Response Personality of Large Language Models
by: Ju, Tianjie, et al.
Published: (2025)
Logical Consistency as a Bridge: Improving LLM Hallucination Detection via Label Constraint Modeling between Responses and Self-Judgments
by: Mi, Hao, et al.
Published: (2026)
Learning from Response not Preference: A Stackelberg Approach for LLM Detoxification using Non-parallel Data
by: Xie, Xinhong, et al.
Published: (2024)
Monitoring Decoding: Mitigating Hallucination via Evaluating the Factuality of Partial Response during Generation
by: Chang, Yurui, et al.
Published: (2025)
Evaluate What You Can't Evaluate: Unassessable Quality for Generated Response
by: Liu, Yongkang, et al.
Published: (2023)
Multi-Response Preference Optimization with Augmented Ranking Dataset
by: Gwon, Hansle, et al.
Published: (2024)