Library Catalog

Bibliographic Details
Main Authors:	Guo, Wenya; Zhang, Zhengkun; Liu, Xumeng; Zhang, Ying; Lu, Ziyu; Zhu, Haoze; Liu, Xubo; Yan, Ruxue
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2505.12754

Similar Items

From "What" to "How": Constrained Reasoning for Autoregressive Image Generation
by: Yan, Ruxue, et al.
Published: (2026)

D3: Diversity, Difficulty, and Dependability-Aware Data Selection for Sample-Efficient LLM Instruction Tuning
by: Zhang, Jia, et al.
Published: (2025)

On Representation Redundancy in Large-Scale Instruction Tuning Data Selection
by: Shu, Youwei, et al.
Published: (2026)

Selective Prompting Tuning for Personalized Conversations with LLMs
by: Huang, Qiushi, et al.
Published: (2024)

Data Selection for LLM Alignment Using Fine-Grained Preferences
by: Zhang, Jia, et al.
Published: (2025)

Federated Continual Instruction Tuning
by: Guo, Haiyang, et al.
Published: (2025)

SelectIT: Selective Instruction Tuning for LLMs via Uncertainty-Aware Self-Reflection
by: Liu, Liangxin, et al.
Published: (2024)

PRISM: Preference-Aware Influence Function Based Data Selection Method for Efficient Fine-Tuning
by: Lin, Qihao, et al.
Published: (2026)

Cross-Fusion Distance: A Novel Metric for Measuring Fusion and Separability Between Data Groups in Representation Space
by: Zhang, Xiaolong, et al.
Published: (2026)

LEAD: Iterative Data Selection for Efficient LLM Instruction Tuning
by: Lin, Xiaotian, et al.
Published: (2025)

TAGCOS: Task-agnostic Gradient Clustered Coreset Selection for Instruction Tuning Data
by: Zhang, Jipeng, et al.
Published: (2024)

Task-Specific Data Selection for Instruction Tuning via Monosemantic Neuronal Activations
by: Ma, Da, et al.
Published: (2025)

T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning
by: Fu, Yanjun, et al.
Published: (2025)

ROSE: A Reward-Oriented Data Selection Framework for LLM Task-Specific Instruction Tuning
by: Wu, Yang, et al.
Published: (2024)

Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
by: Li, Ming, et al.
Published: (2024)

Federated Learning of Dynamic Bayesian Network via Continuous Optimization from Time Series Data
by: Chen, Jianhong, et al.
Published: (2024)

Instruction Mining: Instruction Data Selection for Tuning Large Language Models
by: Cao, Yihan, et al.
Published: (2023)

LESS: Selecting Influential Data for Targeted Instruction Tuning
by: Xia, Mengzhou, et al.
Published: (2024)

Diversity Measurement and Subset Selection for Instruction Tuning Datasets
by: Wang, Peiqi, et al.
Published: (2024)

InfinityMATH: A Scalable Instruction Tuning Dataset in Programmatic Mathematical Reasoning
by: Zhang, Bo-Wen, et al.
Published: (2024)

What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
by: Liu, Wei, et al.
Published: (2023)

Harnessing LLMs Explanations to Boost Surrogate Models in Tabular Data Classification
by: Shi, Ruxue, et al.
Published: (2025)

Deep Learning-based Anomaly Detection and Log Analysis for Computer Networks
by: Wang, Shuzhan, et al.
Published: (2024)

The Best Instruction-Tuning Data are Those That Fit
by: Zhang, Dylan, et al.
Published: (2025)

Data Diversity Matters for Robust Instruction Tuning
by: Bukharin, Alexander, et al.
Published: (2023)

Destroy and Repair Using Hyper Graphs for Routing
by: Li, Ke, et al.
Published: (2025)

Guidelines for Augmentation Selection in Contrastive Learning for Time Series Classification
by: Liu, Ziyu, et al.
Published: (2024)

RPO: Fine-Tuning Visual Generative Models via Rich Vision-Language Preferences
by: Zhao, Hanyang, et al.
Published: (2025)

OPTune: Efficient Online Preference Tuning
by: Chen, Lichang, et al.
Published: (2024)

Feature Matching Intervention: Leveraging Observational Data for Causal Representation Learning
by: Li, Haoze, et al.
Published: (2025)

HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Language Model
by: Guo, Haiyang, et al.
Published: (2025)

Improving Influence-based Instruction Tuning Data Selection for Balanced Learning of Diverse Capabilities
by: Dai, Qirun, et al.
Published: (2025)

DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning
by: Guo, Siyuan, et al.
Published: (2024)

3DS: Medical Domain Adaptation of LLMs via Decomposed Difficulty-based Data Selection
by: Ding, Hongxin, et al.
Published: (2024)

Uncertainty-Aware Gradient Signal-to-Noise Data Selection for Instruction Tuning
by: Yuan, Zhihang, et al.
Published: (2026)

Dynamic Bayesian Optimization Framework for Instruction Tuning in Partial Differential Equation Discovery
by: Qu, Junqi, et al.
Published: (2025)

Filter Images First, Generate Instructions Later: Pre-Instruction Data Selection for Visual Instruction Tuning
by: Safaei, Bardia, et al.
Published: (2025)

A Comprehensive Survey of Synthetic Tabular Data Generation
by: Shi, Ruxue, et al.
Published: (2025)

Adapt-∞: Scalable Continual Multimodal Instruction Tuning via Dynamic Data Selection
by: Maharana, Adyasha, et al.
Published: (2024)

Graph-Guided Concept Selection for Efficient Retrieval-Augmented Generation
by: Liu, Ziyu, et al.
Published: (2025)