| Main Authors: | Zheng, Chen; Sun, Ke; Zhou, Xun |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.08657 |
Similar Items
Balancing Enhancement, Harmlessness, and General Capabilities: Enhancing Conversational LLMs with Direct RLHF
by: Zheng, Chen, et al.
Published: (2024)
Balanced Actor Initialization: Stable RLHF Training of Distillation-Based Reasoning Models
by: Zheng, Chen, et al.
Published: (2025)
C2F-Thinker: Coarse-to-Fine Reasoning with Hint-Guided Reinforcement Learning for Multimodal Sentiment Analysis
by: Luo, Miaosen, et al.
Published: (2026)
ICE-GRT: Instruction Context Enhancement by Generative Reinforcement based Transformers
by: Zheng, Chen, et al.
Published: (2024)
Vuyko Mistral: Adapting LLMs for Low-Resource Dialectal Translation
by: Kyslyi, Roman, et al.
Published: (2025)
Mistral-SPLADE: LLMs for better Learned Sparse Retrieval
by: Doshi, Meet, et al.
Published: (2024)
Doc-V*:Coarse-to-Fine Interactive Visual Reasoning for Multi-Page Document VQA
by: Zheng, Yuanlei, et al.
Published: (2026)
MAgICoRe: Multi-Agent, Iterative, Coarse-to-Fine Refinement for Reasoning
by: Chen, Justin Chih-Yao, et al.
Published: (2024)
CFSP: An Efficient Structured Pruning Framework for LLMs with Coarse-to-Fine Activation Information
by: Wang, Yuxin, et al.
Published: (2024)
Taming Overconfidence in LLMs: Reward Calibration in RLHF
by: Leng, Jixuan, et al.
Published: (2024)
Reward-Robust RLHF in LLMs
by: Yan, Yuzi, et al.
Published: (2024)
Safeguard Fine-Tuned LLMs Through Pre- and Post-Tuning Model Merging
by: Farn, Hua, et al.
Published: (2024)
CFMS: A Coarse-to-Fine Multimodal Synthesis Framework for Enhanced Tabular Reasoning
by: Huang, Qixian, et al.
Published: (2026)
The Thinking Spectrum: An Empirical Study of Tunable Reasoning in LLMs through Model Merging
by: Lan, Xiaochong, et al.
Published: (2025)
Joint Enhancement of Relational Reasoning for Long-Context LLMs
by: Chen, Zhirui, et al.
Published: (2025)
PKU-SafeRLHF: Towards Multi-Level Safety Alignment for LLMs with Human Preference
by: Ji, Jiaming, et al.
Published: (2024)
Continual SFT Matches Multimodal RLHF with Negative Supervision
by: Zhu, Ke, et al.
Published: (2024)
ReasonAny: Incorporating Reasoning Capability to Any Model via Simple and Effective Model Merging
by: Yang, Junyao, et al.
Published: (2026)
Linq-Embed-Mistral Technical Report
by: Choi, Chanyeol, et al.
Published: (2024)
MolReasoner: Toward Effective and Interpretable Reasoning for Molecular LLMs
by: Zhao, Guojiang, et al.
Published: (2025)
Advancing Translation Preference Modeling with RLHF: A Step Towards Cost-Effective Solution
by: Xu, Nuo, et al.
Published: (2024)
From Bytes to Borsch: Fine-Tuning Gemma and Mistral for the Ukrainian Language Representation
by: Kiulian, Artur, et al.
Published: (2024)
TinyThinker: Distilling Reasoning through Coarse-to-Fine Knowledge Internalization with Self-Reflection
by: Piao, Shengmin, et al.
Published: (2024)
Model Tells You Where to Merge: Adaptive KV Cache Merging for LLMs on Long-Context Tasks
by: Wang, Zheng, et al.
Published: (2024)
Removing RLHF Protections in GPT-4 via Fine-Tuning
by: Zhan, Qiusi, et al.
Published: (2023)
RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
by: Yu, Tianyu, et al.
Published: (2023)
Putting on the Thinking Hats: A Survey on Chain of Thought Fine-tuning from the Perspective of Human Reasoning Mechanism
by: Chen, Xiaoshu, et al.
Published: (2025)
Harnessing RLHF for Robust Unanswerability Recognition and Trustworthy Response Generation in LLMs
by: Lin, Shuyuan, et al.
Published: (2025)
RLHF Workflow: From Reward Modeling to Online RLHF
by: Dong, Hanze, et al.
Published: (2024)
Towards Federated RLHF with Aggregated Client Preference for LLMs
by: Wu, Feijie, et al.
Published: (2024)
Search-R2: Enhancing Search-Integrated Reasoning via Actor-Refiner Collaboration
by: He, Bowei, et al.
Published: (2026)
Coarse-to-Fine Personalized LLM Impressions for Streamlined Radiology Reports
by: Sun, Chengbo, et al.
Published: (2025)
Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model
by: Yin, Yueqin, et al.
Published: (2025)
Actor Identification in Discourse: A Challenge for LLMs?
by: Barić, Ana, et al.
Published: (2024)
Multilevel Analysis of Cryptocurrency News using RAG Approach with Fine-Tuned Mistral Large Language Model
by: Pavlyshenko, Bohdan M.
Published: (2025)
Reasoning Pattern Alignment Merging for Adaptive Reasoning
by: Zhong, Zhaofeng, et al.
Published: (2026)
Effective Distillation of Table-based Reasoning Ability from LLMs
by: Yang, Bohao, et al.
Published: (2023)
Fovea Transformer: Efficient Long-Context Modeling with Structured Fine-to-Coarse Attention
by: He, Ziwei, et al.
Published: (2023)
Document-Level Tabular Numerical Cross-Checking: A Coarse-to-Fine Approach
by: Pang, Chaoxu, et al.
Published: (2025)
Adaptive Graph Refinement and Label Propagation with LLMs for Cost-Effective Entity Resolution
by: Wang, Hongtao, et al.
Published: (2026)