| Main Authors: | Chen, Ruijun; Liang, Jiehao; Gao, Shiping; Wan, Fanqi; Quan, Xiaojun |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.10813 |
Similar Items
BlockPruner: Fine-grained Pruning for Large Language Models
by: Zhong, Longguang, et al.
Published: (2024)
Discriminative Policy Optimization for Token-Level Reward Models
by: Chen, Hongzhan, et al.
Published: (2025)
Advantage-Guided Distillation for Preference Alignment in Small Language Models
by: Gao, Shiping, et al.
Published: (2025)
FuseChat: Knowledge Fusion of Chat Models
by: Wan, Fanqi, et al.
Published: (2024)
SPELL: Self-Play Reinforcement Learning for Evolving Long-Context Language Models
by: Yang, Ziyi, et al.
Published: (2025)
Stabilizing Policy Optimization via Logits Convexity
by: Chen, Hongzhan, et al.
Published: (2026)
FuseRL: Dense Preference Optimization for Heterogeneous Model Fusion
by: Zhong, Longguang, et al.
Published: (2025)
Weighted-Reward Preference Optimization for Implicit Model Fusion
by: Yang, Ziyi, et al.
Published: (2024)
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion
by: Yang, Ziyi, et al.
Published: (2025)
Unleashing Implicit Rewards: Prefix-Value Learning for Distribution-Level Optimization
by: Gao, Shiping, et al.
Published: (2026)
ProactiveEval: A Unified Evaluation Framework for Proactive Dialogue Agents
by: Liu, Tianjian, et al.
Published: (2025)
Knowledge Fusion of Chat LLMs: A Preliminary Technical Report
by: Wan, Fanqi, et al.
Published: (2024)
Knowledge Fusion of Large Language Models
by: Wan, Fanqi, et al.
Published: (2024)
Knowledge Verification to Nip Hallucination in the Bud
by: Wan, Fanqi, et al.
Published: (2024)
Lookahead Routing for Large Language Models
by: Huang, Canbin, et al.
Published: (2025)
Knowledge Distillation of Black-Box Large Language Models
by: Chen, Hongzhan, et al.
Published: (2024)
ProFuser: Progressive Fusion of Large Language Models
by: Shi, Tianyuan, et al.
Published: (2024)
Agentic Policy Optimization via Instruction-Policy Co-Evolution
by: Zhou, Han, et al.
Published: (2025)
KBE-DME: Dynamic Multimodal Evaluation via Knowledge Enhanced Benchmark Evolution
by: Zhang, Junzhe, et al.
Published: (2025)
ContraSolver: Self-Alignment of Language Models by Resolving Internal Preference Contradictions
by: Zhang, Xu, et al.
Published: (2024)
Evaluating, Understanding, and Improving Constrained Text Generation for Large Language Models
by: Chen, Xiang, et al.
Published: (2023)
DrugImproverGPT: A Large Language Model for Drug Optimization with Fine-Tuning via Structured Policy Optimization
by: Liu, Xuefeng, et al.
Published: (2025)
Parameter-Efficient Fine-Tuning with Discrete Fourier Transform
by: Gao, Ziqi, et al.
Published: (2024)
Zero-Shot Cross-Domain Code Search without Fine-Tuning
by: Liang, Keyu, et al.
Published: (2025)
ThinkSwitcher: When to Think Hard, When to Think Fast
by: Liang, Guosheng, et al.
Published: (2025)
Beyond Fine-Tuning: In-Context Learning and Chain-of-Thought for Reasoned Distractor Generation
by: Alhazmi, Elaf, et al.
Published: (2026)
Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning
by: Yang, Zhaorui, et al.
Published: (2024)
Evaluating Self-Generated Documents for Enhancing Retrieval-Augmented Generation with Large Language Models
by: Li, Jiatao, et al.
Published: (2024)
Co-Evolution of Policy and Internal Reward for Language Agents
by: Wang, Xinyu, et al.
Published: (2026)
TPP-LLM: Modeling Temporal Point Processes by Efficiently Fine-Tuning Large Language Models
by: Liu, Zefang, et al.
Published: (2024)
Gradual Learning: Optimizing Fine-Tuning with Partially Mastered Knowledge in Large Language Models
by: Li, Bozhou, et al.
Published: (2024)
Training Prompt Matters: State-Adaptive Optimization for Robust Fine-Tuning
by: Shi, Wenhang, et al.
Published: (2026)
Alirector: Alignment-Enhanced Chinese Grammatical Error Corrector
by: Yang, Haihui, et al.
Published: (2024)
Optimizing Soft Prompt Tuning via Structural Evolution
by: Huang, Zhenzhen, et al.
Published: (2026)
Analyzing and Evaluating Correlation Measures in NLG Meta-Evaluation
by: Gao, Mingqi, et al.
Published: (2024)
Exploring the Multilingual NLG Evaluation Abilities of LLM-Based Evaluators
by: Chang, Jiayi, et al.
Published: (2025)
Large Language Models as Topological Structure Enhancers for Text-Attributed Graphs
by: Sun, Shengyin, et al.
Published: (2023)
Who Writes What: Unveiling the Impact of Author Roles on AI-generated Text Detection
by: Li, Jiatao, et al.
Published: (2025)
One-Token Rollout: Guiding Supervised Fine-Tuning of LLMs with Policy Gradient
by: Ming, Rui, et al.
Published: (2025)
Using LLMs for Automated Privacy Policy Analysis: Prompt Engineering, Fine-Tuning and Explainability
by: Chen, Yuxin, et al.
Published: (2025)