:: Library Catalog

Bibliographic Details
Main Authors:	Zheng, Chen; Sun, Ke; Zhou, Xun
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2406.08657

Similar Items

Balancing Enhancement, Harmlessness, and General Capabilities: Enhancing Conversational LLMs with Direct RLHF
by: Zheng, Chen, et al.
Published: (2024)

Balanced Actor Initialization: Stable RLHF Training of Distillation-Based Reasoning Models
by: Zheng, Chen, et al.
Published: (2025)

C2F-Thinker: Coarse-to-Fine Reasoning with Hint-Guided Reinforcement Learning for Multimodal Sentiment Analysis
by: Luo, Miaosen, et al.
Published: (2026)

ICE-GRT: Instruction Context Enhancement by Generative Reinforcement based Transformers
by: Zheng, Chen, et al.
Published: (2024)

Vuyko Mistral: Adapting LLMs for Low-Resource Dialectal Translation
by: Kyslyi, Roman, et al.
Published: (2025)

Mistral-SPLADE: LLMs for better Learned Sparse Retrieval
by: Doshi, Meet, et al.
Published: (2024)

Doc-V*: Coarse-to-Fine Interactive Visual Reasoning for Multi-Page Document VQA
by: Zheng, Yuanlei, et al.
Published: (2026)

MAgICoRe: Multi-Agent, Iterative, Coarse-to-Fine Refinement for Reasoning
by: Chen, Justin Chih-Yao, et al.
Published: (2024)

CFSP: An Efficient Structured Pruning Framework for LLMs with Coarse-to-Fine Activation Information
by: Wang, Yuxin, et al.
Published: (2024)

Taming Overconfidence in LLMs: Reward Calibration in RLHF
by: Leng, Jixuan, et al.
Published: (2024)

Reward-Robust RLHF in LLMs
by: Yan, Yuzi, et al.
Published: (2024)

Safeguard Fine-Tuned LLMs Through Pre- and Post-Tuning Model Merging
by: Farn, Hua, et al.
Published: (2024)

CFMS: A Coarse-to-Fine Multimodal Synthesis Framework for Enhanced Tabular Reasoning
by: Huang, Qixian, et al.
Published: (2026)

The Thinking Spectrum: An Empirical Study of Tunable Reasoning in LLMs through Model Merging
by: Lan, Xiaochong, et al.
Published: (2025)

Joint Enhancement of Relational Reasoning for Long-Context LLMs
by: Chen, Zhirui, et al.
Published: (2025)

PKU-SafeRLHF: Towards Multi-Level Safety Alignment for LLMs with Human Preference
by: Ji, Jiaming, et al.
Published: (2024)

Continual SFT Matches Multimodal RLHF with Negative Supervision
by: Zhu, Ke, et al.
Published: (2024)

ReasonAny: Incorporating Reasoning Capability to Any Model via Simple and Effective Model Merging
by: Yang, Junyao, et al.
Published: (2026)

Linq-Embed-Mistral Technical Report
by: Choi, Chanyeol, et al.
Published: (2024)

MolReasoner: Toward Effective and Interpretable Reasoning for Molecular LLMs
by: Zhao, Guojiang, et al.
Published: (2025)

Advancing Translation Preference Modeling with RLHF: A Step Towards Cost-Effective Solution
by: Xu, Nuo, et al.
Published: (2024)

From Bytes to Borsch: Fine-Tuning Gemma and Mistral for the Ukrainian Language Representation
by: Kiulian, Artur, et al.
Published: (2024)

TinyThinker: Distilling Reasoning through Coarse-to-Fine Knowledge Internalization with Self-Reflection
by: Piao, Shengmin, et al.
Published: (2024)

Model Tells You Where to Merge: Adaptive KV Cache Merging for LLMs on Long-Context Tasks
by: Wang, Zheng, et al.
Published: (2024)

Removing RLHF Protections in GPT-4 via Fine-Tuning
by: Zhan, Qiusi, et al.
Published: (2023)

RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
by: Yu, Tianyu, et al.
Published: (2023)

Putting on the Thinking Hats: A Survey on Chain of Thought Fine-tuning from the Perspective of Human Reasoning Mechanism
by: Chen, Xiaoshu, et al.
Published: (2025)

Harnessing RLHF for Robust Unanswerability Recognition and Trustworthy Response Generation in LLMs
by: Lin, Shuyuan, et al.
Published: (2025)

RLHF Workflow: From Reward Modeling to Online RLHF
by: Dong, Hanze, et al.
Published: (2024)

Towards Federated RLHF with Aggregated Client Preference for LLMs
by: Wu, Feijie, et al.
Published: (2024)

Search-R2: Enhancing Search-Integrated Reasoning via Actor-Refiner Collaboration
by: He, Bowei, et al.
Published: (2026)

Coarse-to-Fine Personalized LLM Impressions for Streamlined Radiology Reports
by: Sun, Chengbo, et al.
Published: (2025)

Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
by: Yin, Yueqin, et al.
Published: (2025)

Actor Identification in Discourse: A Challenge for LLMs?
by: Barić, Ana, et al.
Published: (2024)

Multilevel Analysis of Cryptocurrency News using RAG Approach with Fine-Tuned Mistral Large Language Model
by: Pavlyshenko, Bohdan M.
Published: (2025)

Reasoning Pattern Alignment Merging for Adaptive Reasoning
by: Zhong, Zhaofeng, et al.
Published: (2026)

Effective Distillation of Table-based Reasoning Ability from LLMs
by: Yang, Bohao, et al.
Published: (2023)

Fovea Transformer: Efficient Long-Context Modeling with Structured Fine-to-Coarse Attention
by: He, Ziwei, et al.
Published: (2023)

Document-Level Tabular Numerical Cross-Checking: A Coarse-to-Fine Approach
by: Pang, Chaoxu, et al.
Published: (2025)

Adaptive Graph Refinement and Label Propagation with LLMs for Cost-Effective Entity Resolution
by: Wang, Hongtao, et al.
Published: (2026)