| Main Authors: | Gu, Yanggan, Wang, Yuanyi, Yan, Zhaoyi, Zhang, Yiming, Zhou, Qi, Wu, Fei, Yang, Hongxia |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2505.13878 |
Similar Items
InfiGFusion: Graph-on-Logits Distillation via Efficient Gromov-Wasserstein for Model Fusion
by: Wang, Yuanyi, et al.
Published: (2025)
InfiFusion: A Unified Framework for Enhanced Cross-Model Reasoning via LLM Fusion
by: Yan, Zhaoyi, et al.
Published: (2025)
Model Merging Scaling Laws in Large Language Models
by: Wang, Yuanyi, et al.
Published: (2025)
Access Sets Matter: Budgeting Expert Reads for Scalable Weight-Space Model Merging
by: Wang, Yuanyi, et al.
Published: (2026)
E-PMQ: Expert-Guided Post-Merge Quantization with Merged-Weight Anchoring
by: Wang, Wenjun, et al.
Published: (2026)
Self-Play Preference Optimization for Language Model Alignment
by: Wu, Yue, et al.
Published: (2024)
InfiR : Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning
by: Xie, Congkai, et al.
Published: (2025)
DavIR: Data Selection via Implicit Reward for Large Language Models
by: Zhou, Haotian, et al.
Published: (2023)
Capturing Nuanced Preferences: Preference-Aligned Distillation for Small Language Models
by: Gu, Yanggan, et al.
Published: (2025)
Accelerated Preference Optimization for Large Language Model Alignment
by: He, Jiafan, et al.
Published: (2024)
InfiR2: A Comprehensive FP8 Training Recipe for Reasoning-Enhanced Language Models
by: Wang, Wenjun, et al.
Published: (2025)
Geometry Conflict: Explaining and Controlling Forgetting in LLM Continual Post-Training
by: Wang, Yuanyi, et al.
Published: (2026)
FeatCal: Feature Calibration for Post-Merging Models
by: Gu, Yanggan, et al.
Published: (2026)
Infi-MMR: Curriculum-based Unlocking Multimodal Reasoning via Phased Reinforcement Learning in Multimodal Small Language Models
by: Liu, Zeyu, et al.
Published: (2025)
MergePipe: A Budget-Aware Parameter Management System for Scalable LLM Merging
by: Wang, Yuanyi, et al.
Published: (2026)
Weighted-Reward Preference Optimization for Implicit Model Fusion
by: Yang, Ziyi, et al.
Published: (2024)
ROPO: Robust Preference Optimization for Large Language Models
by: Liang, Xize, et al.
Published: (2024)
InfiBench: Evaluating the Question-Answering Capabilities of Code Large Language Models
by: Li, Linyi, et al.
Published: (2024)
InfiCoEvalChain: A Blockchain-Based Decentralized Framework for Collaborative LLM Evaluation
by: Yang, Yifan, et al.
Published: (2026)
mDPO: Conditional Preference Optimization for Multimodal Large Language Models
by: Wang, Fei, et al.
Published: (2024)
Multi-Reference Preference Optimization for Large Language Models
by: Le, Hung, et al.
Published: (2024)
FedPDPO: Federated Personalized Direct Preference Optimization for Large Language Model Alignment
by: Zhu, Kewen, et al.
Published: (2026)
InfiMed-ORBIT: Aligning LLMs on Open-Ended Complex Tasks via Rubric-Based Incremental Training
by: Wang, Pengkai, et al.
Published: (2025)
InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization
by: Liu, Yuhang, et al.
Published: (2025)
Reasoning Factual Knowledge in Structured Data with Large Language Models
by: Huang, Sirui, et al.
Published: (2024)
On the Limited Generalization Capability of the Implicit Reward Model Induced by Direct Preference Optimization
by: Lin, Yong, et al.
Published: (2024)
InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners
by: Liu, Yuhang, et al.
Published: (2025)
ULMA: Unified Language Model Alignment with Human Demonstration and Point-wise Preference
by: Cai, Tianchi, et al.
Published: (2023)
Alternate Preference Optimization for Unlearning Factual Knowledge in Large Language Models
by: Mekala, Anmol, et al.
Published: (2024)
InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection
by: Liu, Yuhang, et al.
Published: (2025)
InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks
by: Hu, Xueyu, et al.
Published: (2024)
LifeAlign: Lifelong Alignment for Large Language Models with Memory-Augmented Focalized Preference Optimization
by: Li, Junsong, et al.
Published: (2025)
InfiMed: Low-Resource Medical MLLMs with Advancing Understanding and Reasoning
by: Liu, Zeyu, et al.
Published: (2025)
TWIN-GPT: Digital Twins for Clinical Trials via Large Language Model
by: Wang, Yue, et al.
Published: (2024)
Optimizing RLHF Training for Large Language Models with Stage Fusion
by: Zhong, Yinmin, et al.
Published: (2024)
Optimizing Temperature for Language Models with Multi-Sample Inference
by: Du, Weihua, et al.
Published: (2025)
Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment
by: Zhang, Yifan, et al.
Published: (2024)
Large Language Models as Optimizers
by: Yang, Chengrun, et al.
Published: (2023)
Discovering Implicit Large Language Model Alignment Objectives
by: Chen, Edward, et al.
Published: (2026)
Group Preference Optimization: Few-Shot Alignment of Large Language Models
by: Zhao, Siyan, et al.
Published: (2023)