| Main Authors: | Guo, Wenya, Zhang, Zhengkun, Liu, Xumeng, Zhang, Ying, Lu, Ziyu, Zhu, Haoze, Liu, Xubo, Yan, Ruxue |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.12754 |
Similar Items
From "What" to "How": Constrained Reasoning for Autoregressive Image Generation
by: Yan, Ruxue, et al.
Published: (2026)
D3: Diversity, Difficulty, and Dependability-Aware Data Selection for Sample-Efficient LLM Instruction Tuning
by: Zhang, Jia, et al.
Published: (2025)
On Representation Redundancy in Large-Scale Instruction Tuning Data Selection
by: Shu, Youwei, et al.
Published: (2026)
Selective Prompting Tuning for Personalized Conversations with LLMs
by: Huang, Qiushi, et al.
Published: (2024)
Data Selection for LLM Alignment Using Fine-Grained Preferences
by: Zhang, Jia, et al.
Published: (2025)
Federated Continual Instruction Tuning
by: Guo, Haiyang, et al.
Published: (2025)
SelectIT: Selective Instruction Tuning for LLMs via Uncertainty-Aware Self-Reflection
by: Liu, Liangxin, et al.
Published: (2024)
PRISM: Preference-Aware Influence Function Based Data Selection Method for Efficient Fine-Tuning
by: Lin, Qihao, et al.
Published: (2026)
Cross-Fusion Distance: A Novel Metric for Measuring Fusion and Separability Between Data Groups in Representation Space
by: Zhang, Xiaolong, et al.
Published: (2026)
LEAD: Iterative Data Selection for Efficient LLM Instruction Tuning
by: Lin, Xiaotian, et al.
Published: (2025)
TAGCOS: Task-agnostic Gradient Clustered Coreset Selection for Instruction Tuning Data
by: Zhang, Jipeng, et al.
Published: (2024)
Task-Specific Data Selection for Instruction Tuning via Monosemantic Neuronal Activations
by: Ma, Da, et al.
Published: (2025)
T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning
by: Fu, Yanjun, et al.
Published: (2025)
ROSE: A Reward-Oriented Data Selection Framework for LLM Task-Specific Instruction Tuning
by: Wu, Yang, et al.
Published: (2024)
Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
by: Li, Ming, et al.
Published: (2024)
Federated Learning of Dynamic Bayesian Network via Continuous Optimization from Time Series Data
by: Chen, Jianhong, et al.
Published: (2024)
Instruction Mining: Instruction Data Selection for Tuning Large Language Models
by: Cao, Yihan, et al.
Published: (2023)
LESS: Selecting Influential Data for Targeted Instruction Tuning
by: Xia, Mengzhou, et al.
Published: (2024)
Diversity Measurement and Subset Selection for Instruction Tuning Datasets
by: Wang, Peiqi, et al.
Published: (2024)
InfinityMATH: A Scalable Instruction Tuning Dataset in Programmatic Mathematical Reasoning
by: Zhang, Bo-Wen, et al.
Published: (2024)
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
by: Liu, Wei, et al.
Published: (2023)
Harnessing LLMs Explanations to Boost Surrogate Models in Tabular Data Classification
by: Shi, Ruxue, et al.
Published: (2025)
Deep Learning-based Anomaly Detection and Log Analysis for Computer Networks
by: Wang, Shuzhan, et al.
Published: (2024)
The Best Instruction-Tuning Data are Those That Fit
by: Zhang, Dylan, et al.
Published: (2025)
Data Diversity Matters for Robust Instruction Tuning
by: Bukharin, Alexander, et al.
Published: (2023)
Destroy and Repair Using Hyper Graphs for Routing
by: Li, Ke, et al.
Published: (2025)
Guidelines for Augmentation Selection in Contrastive Learning for Time Series Classification
by: Liu, Ziyu, et al.
Published: (2024)
RPO: Fine-Tuning Visual Generative Models via Rich Vision-Language Preferences
by: Zhao, Hanyang, et al.
Published: (2025)
OPTune: Efficient Online Preference Tuning
by: Chen, Lichang, et al.
Published: (2024)
Feature Matching Intervention: Leveraging Observational Data for Causal Representation Learning
by: Li, Haoze, et al.
Published: (2025)
HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Language Model
by: Guo, Haiyang, et al.
Published: (2025)
Improving Influence-based Instruction Tuning Data Selection for Balanced Learning of Diverse Capabilities
by: Dai, Qirun, et al.
Published: (2025)
DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning
by: Guo, Siyuan, et al.
Published: (2024)
3DS: Medical Domain Adaptation of LLMs via Decomposed Difficulty-based Data Selection
by: Ding, Hongxin, et al.
Published: (2024)
Uncertainty-Aware Gradient Signal-to-Noise Data Selection for Instruction Tuning
by: Yuan, Zhihang, et al.
Published: (2026)
Dynamic Bayesian Optimization Framework for Instruction Tuning in Partial Differential Equation Discovery
by: Qu, Junqi, et al.
Published: (2025)
Filter Images First, Generate Instructions Later: Pre-Instruction Data Selection for Visual Instruction Tuning
by: Safaei, Bardia, et al.
Published: (2025)
A Comprehensive Survey of Synthetic Tabular Data Generation
by: Shi, Ruxue, et al.
Published: (2025)
Adapt-$\infty$: Scalable Continual Multimodal Instruction Tuning via Dynamic Data Selection
by: Maharana, Adyasha, et al.
Published: (2024)
Graph-Guided Concept Selection for Efficient Retrieval-Augmented Generation
by: Liu, Ziyu, et al.
Published: (2025)