:: Library Catalog

Bibliographic Details
Main Authors:	Li, Yisen; Yang, Lingfeng; Shen, Wenxuan; Zhou, Pan; Wan, Yao; Lin, Weiwei; Chen, Dongping
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2503.01836

Similar Items

Reward Modeling with Ordinal Feedback: Wisdom of the Crowd
by: Liu, Shang, et al.
Published: (2024)

MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures
by: Ni, Jinjie, et al.
Published: (2024)

Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
by: Li, Ming, et al.
Published: (2024)

Wisdom of the Silicon Crowd: LLM Ensemble Prediction Capabilities Rival Human Crowd Accuracy
by: Schoenegger, Philipp, et al.
Published: (2024)

Are We on the Right Way to Assessing LLM-as-a-Judge?
by: Feng, Yuanning, et al.
Published: (2025)

GradAlign: Gradient-Aligned Data Selection for LLM Reinforcement Learning
by: Yang, Ningyuan, et al.
Published: (2026)

Self-Cognition in Large Language Models: An Exploratory Study
by: Chen, Dongping, et al.
Published: (2024)

Select2Reason: Efficient Instruction-Tuning Data Selection for Long-CoT Reasoning
by: Yang, Cehao, et al.
Published: (2025)

Data Selection for Multi-turn Dialogue Instruction Tuning
by: Li, Bo, et al.
Published: (2026)

SelectLLM: Can LLMs Select Important Instructions to Annotate?
by: Parkar, Ritik Sachin, et al.
Published: (2024)

ROSE: A Reward-Oriented Data Selection Framework for LLM Task-Specific Instruction Tuning
by: Wu, Yang, et al.
Published: (2024)

Wisdom from Diversity: Bias Mitigation Through Hybrid Human-LLM Crowds
by: Abels, Axel, et al.
Published: (2025)

On the Step Length Confounding in LLM Reasoning Data Selection
by: Wang, Bing, et al.
Published: (2026)

MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark
by: Chen, Dongping, et al.
Published: (2024)

The Impact of Large Language Models in Academia: from Writing to Speaking
by: Geng, Mingmeng, et al.
Published: (2024)

LESS: Selecting Influential Data for Targeted Instruction Tuning
by: Xia, Mengzhou, et al.
Published: (2024)

HonestLLM: Toward an Honest and Helpful Large Language Model
by: Gao, Chujie, et al.
Published: (2024)

TAGCOS: Task-agnostic Gradient Clustered Coreset Selection for Instruction Tuning Data
by: Zhang, Jipeng, et al.
Published: (2024)

SEAL: Safety-enhanced Aligned LLM Fine-tuning via Bilevel Data Selection
by: Shen, Han, et al.
Published: (2024)

MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space
by: Chen, Yicheng, et al.
Published: (2025)

Code Execution as Grounded Supervision for LLM Reasoning
by: Jung, Dongwon, et al.
Published: (2025)

Instruction Mining: Instruction Data Selection for Tuning Large Language Models
by: Cao, Yihan, et al.
Published: (2023)

Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay
by: Sun, Yifan, et al.
Published: (2025)

Boosting LLM via Learning from Data Iteratively and Selectively
by: Jia, Qi, et al.
Published: (2024)

Token Cleaning: Fine-Grained Data Selection for LLM Supervised Fine-Tuning
by: Pang, Jinlong, et al.
Published: (2025)

On the Multi-turn Instruction Following for Conversational Web Agents
by: Deng, Yang, et al.
Published: (2024)

Rethinking Data Selection at Scale: Random Selection is Almost All You Need
by: Xia, Tingyu, et al.
Published: (2024)

Human-Instruction-Free LLM Self-Alignment with Limited Samples
by: Guo, Hongyi, et al.
Published: (2024)

Generalizable End-to-End Tool-Use RL with Synthetic CodeGym
by: Du, Weihua, et al.
Published: (2025)

Token-level Data Selection for Safe LLM Fine-tuning
by: Li, Yanping, et al.
Published: (2026)

TACOS: Open Tagging and Comparative Scoring for Instruction Fine-Tuning Data Selection
by: He, Xixiang, et al.
Published: (2025)

Wikipedia in the Era of LLMs: Evolution and Risks
by: Huang, Siming, et al.
Published: (2025)

DataShield: Safety-degrading Data Filtering for LLM Benign Instruction Fine-Tuning
by: Zhang, Junbo, et al.
Published: (2026)

Infinity Instruct: Scaling Instruction Selection and Synthesis to Enhance Language Models
by: Li, Jijie, et al.
Published: (2025)

Routing with Generated Data: Annotation-Free LLM Skill Estimation and Expert Selection
by: Niu, Tianyi, et al.
Published: (2026)

Star-Agents: Automatic Data Optimization with LLM Agents for Instruction Tuning
by: Zhou, Hang, et al.
Published: (2024)

Optimizing Length Compression in Large Reasoning Models
by: Cheng, Zhengxiang, et al.
Published: (2025)

Uncertainty-Aware Gradient Signal-to-Noise Data Selection for Instruction Tuning
by: Yuan, Zhihang, et al.
Published: (2026)

CODESYNC: Synchronizing Large Language Models with Dynamic Code Evolution at Scale
by: Wang, Chenlong, et al.
Published: (2025)

Chasing Random: Instruction Selection Strategies Fail to Generalize
by: Diddee, Harshita, et al.
Published: (2024)