| Main Authors: | Ko, Myeongseob, Kang, Feiyang, Shi, Weiyan, Jin, Ming, Yu, Zhou, Jia, Ruoxi |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.08922 |
Similar Items
Characterizing Model-Native Skills
by: Kang, Feiyang, et al.
Published: (2026)
The Signal is in the Steps: Local Scoring for Reasoning Data Selection
by: Just, Hoang Anh, et al.
Published: (2025)
Probing Knowledge Holes in Unlearned LLMs
by: Ko, Myeongseob, et al.
Published: (2025)
Injecting Measurement Information Yields a Fast and Noise-Robust Diffusion-Based Inverse Problem Solver
by: Patsenker, Jonathan, et al.
Published: (2025)
Boosting Alignment for Post-Unlearning Text-to-Image Generative Models
by: Ko, Myeongseob, et al.
Published: (2024)
Capturing the Temporal Dependence of Training Data Influence
by: Wang, Jiachen T., et al.
Published: (2024)
AdaDeDup: Adaptive Hybrid Data Pruning for Efficient Large-Scale Object Detection Training
by: Kang, Feiyang, et al.
Published: (2025)
Retracing the Past: LLMs Emit Training Data When They Get Lost
by: Ko, Myeongseob, et al.
Published: (2025)
AutoScale: Scale-Aware Data Mixing for Pre-Training LLMs
by: Kang, Feiyang, et al.
Published: (2024)
A Sustainable AI Economy Needs Data Deals That Work for Generators
by: Jia, Ruoxi, et al.
Published: (2026)
Accumulative SGD Influence Estimation for Data Attribution
by: Shi, Yunxiao, et al.
Published: (2025)
Data-Centric Human Preference with Rationales for Direct Preference Alignment
by: Just, Hoang Anh, et al.
Published: (2024)
Efficient Data Shapley for Weighted Nearest Neighbor Algorithms
by: Wang, Jiachen T., et al.
Published: (2024)
DCFold: Efficient Protein Structure Generation with Single Forward Pass
by: Zhang, Zhe, et al.
Published: (2026)
f-INE: A Hypothesis Testing Framework for Estimating Influence under Training Randomness
by: Panda, Subhodip, et al.
Published: (2025)
Memory-Induced Tool-Drift in LLM Agents
by: Dabas, Mahavir, et al.
Published: (2026)
Get more for less: Principled Data Selection for Warming Up Fine-Tuning in LLMs
by: Kang, Feiyang, et al.
Published: (2024)
DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models
by: Kwon, Yongchan, et al.
Published: (2023)
The Convergence Gap: Instruction-Tuned Language Models Stabilize Later in the Forward Pass
by: Zhou, Yifan
Published: (2026)
HyperINF: Unleashing the HyperPower of the Schulz's Method for Data Influence Estimation
by: Zhou, Xinyu, et al.
Published: (2024)
Quagmires in SFT-RL Post-Training: When High SFT Scores Mislead and What to Use Instead
by: Kang, Feiyang, et al.
Published: (2025)
The Approximate Fisher Influence Function: Faster Estimation of Data Influence in Statistical Models
by: Lev, Omri, et al.
Published: (2024)
Can We Trust the Performance Evaluation of Uncertainty Estimation Methods in Text Summarization?
by: He, Jianfeng, et al.
Published: (2024)
Training Data Influence Analysis and Estimation: A Survey
by: Hammoudeh, Zayd, et al.
Published: (2022)
Influence Dynamics and Stagewise Data Attribution
by: Lee, Jin Hwa, et al.
Published: (2025)
Influence Strength Estimation in Hyperbolic Space for Social Influence Maximization
by: Qiao, Hongliang, et al.
Published: (2025)
MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models
by: Yu, Zichun, et al.
Published: (2024)
Just Enough Shifts: Mitigating Over-Refusal in Aligned Language Models with Targeted Representation Fine-Tuning
by: Dabas, Mahavir, et al.
Published: (2025)
Test-Time Model Adaptation with Only Forward Passes
by: Niu, Shuaicheng, et al.
Published: (2024)
Automated Efficient Estimation using Monte Carlo Efficient Influence Functions
by: Agrawal, Raj, et al.
Published: (2024)
Influence Functions for Efficient Data Selection in Reasoning
by: Humane, Prateek, et al.
Published: (2025)
Layer-Aware Influence for Online Data Valuation Estimation
by: Yang, Ziao, et al.
Published: (2025)
Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs
by: Sel, Bilgehan, et al.
Published: (2024)
From Weak Cues to Real Identities: Evaluating Inference-Driven De-Anonymization in LLM Agents
by: Ko, Myeongseob, et al.
Published: (2026)
Demystifying Synthetic Data in LLM Pre-training: A Systematic Study of Scaling Laws, Benefits, and Pitfalls
by: Kang, Feiyang, et al.
Published: (2025)
DiPT: Enhancing LLM reasoning through diversified perspective-taking
by: Just, Hoang Anh, et al.
Published: (2024)
Data-Efficient RLVR via Off-Policy Influence Guidance
by: Zhu, Erle, et al.
Published: (2025)
Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation
by: Xie, Tong, et al.
Published: (2024)
Fine-Tuning Language Models with Just Forward Passes
by: Malladi, Sadhika, et al.
Published: (2023)
Efficient Data Selection at Scale via Influence Distillation
by: Nikdan, Mahdi, et al.
Published: (2025)