| Main Authors: | Jiao, Cathy, Pan, Yijun, Xiao, Emily, Sheng, Daisy, Jain, Niket, Zhao, Hanzhang, Dasgupta, Ishita, Ma, Jiaqi W., Xiong, Chenyan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2507.09424 |
Similar Items
On the Feasibility of In-Context Probing for Data Attribution
by: Jiao, Cathy, et al.
Published: (2024)
Fairshare Data Pricing via Data Valuation for Large Language Models
by: Zhang, Luyang, et al.
Published: (2025)
An Economic Framework for Generative Engines: Advertising or Subscription?
by: Zhang, Luyang, et al.
Published: (2026)
Detecting and Filtering Unsafe Training Data via Data Attribution with Denoised Representation
by: Pan, Yijun, et al.
Published: (2025)
ResearchArena: Benchmarking Large Language Models' Ability to Collect and Organize Information as Research Agents
by: Kang, Hao, et al.
Published: (2024)
Generating Pretraining Tokens from Organic Data for Data-Bound Scaling
by: Yu, Zichun, et al.
Published: (2026)
How do Large Language Models Navigate Conflicts between Honesty and Helpfulness?
by: Liu, Ryan, et al.
Published: (2024)
RePro: Training Language Models to Faithfully Recycle the Web for Pretraining
by: Yu, Zichun, et al.
Published: (2025)
AgentWebBench: Benchmarking Multi-Agent Coordination in Agentic Web
by: Zhong, Shanshan, et al.
Published: (2026)
$\texttt{dattri}$: A Library for Efficient Data Attribution
by: Deng, Junwei, et al.
Published: (2024)
MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models
by: Yu, Zichun, et al.
Published: (2024)
Evaluating and Improving Graph to Text Generation with Large Language Models
by: He, Jie, et al.
Published: (2025)
A geometric flow on noncompact affine Riemannian manifolds
by: Jiao, Heming, et al.
Published: (2024)
Daunce: Data Attribution through Uncertainty Estimation
by: Pan, Xingyuan, et al.
Published: (2025)
Task Priors: Enhancing Model Evaluation by Considering the Entire Space of Downstream Tasks
by: Patel, Niket, et al.
Published: (2025)
Handling Missing Responses under Cluster Dependence with Applications to Language Model Evaluation
by: Zeng, Zhenghao, et al.
Published: (2025)
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
by: Li, Xiaochuan, et al.
Published: (2024)
PulseLM: A Foundation Dataset and Benchmark for PPG-Text Learning
by: Pham, Hung Manh, et al.
Published: (2026)
A mark and recapture perspective on vaccination touchpoints
by: Thakkar, Niket
Published: (2025)
WorldValuesBench: A Large-Scale Benchmark Dataset for Multi-Cultural Value Awareness of Language Models
by: Zhao, Wenlong, et al.
Published: (2024)
FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models
by: Kang, Hao, et al.
Published: (2025)
AutoRule: Reasoning Chain-of-thought Extracted Rule-based Rewards Improve Preference Learning
by: Wang, Tevin, et al.
Published: (2025)
First-Step Advantage: Importance of Starting Right in Multi-Step Math Reasoning
by: Jain, Kushal, et al.
Published: (2023)
THE EURO AREA ENLARGEMENT: THE TARGET DATE PROBLEM
by: Arūnas Dulkys
Published: (2009)
THE DATE OF PUBLICATION OF ANTONS VERZEICHNISS DER CONCHYLIEN
by: Cernohorsky, Walter Oliver.
Published: (1978)
Respond Beyond Language: A Benchmark for Video Generation in Response to Realistic User Intents
by: Wang, Shuting, et al.
Published: (2025)
Adversarial Attacks on Data Attribution
by: Wang, Xinhe, et al.
Published: (2024)
Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph
by: Vashurin, Roman, et al.
Published: (2024)
KEDRec-LM: A Knowledge-distilled Explainable Drug Recommendation Large Language Model
by: Zhang, Kai, et al.
Published: (2025)
Attributes of Destination Competitiveness for Island Tourism: Application of Text Data Mining
by: Arum Park, et al.
Published: (2026)
Attribution Bias in Large Language Models
by: Berman, Eliza, et al.
Published: (2026)
Evolution of Benchmark: Black-Box Optimization Benchmark Design through Large Language Model
by: Wang, Chen, et al.
Published: (2026)
Source Attribution for Large Language Model-Generated Data
by: Wang, Jingtan, et al.
Published: (2023)
Talent or Luck? Evaluating Attribution Bias in Large Language Models
by: Raj, Chahat, et al.
Published: (2025)
Invariant Features in Language Models: Geometric Characterization and Model Attribution
by: Dasgupta, Agnibh, et al.
Published: (2026)
Understand User Opinions of Large Language Models via LLM-Powered In-the-Moment User Experience Interviews
by: Liu, Mengqiao, et al.
Published: (2025)
Evaluating Attribute Comprehension in Large Vision-Language Models
by: Zhang, Haiwen, et al.
Published: (2024)
Towards Verifiable Generation: A Benchmark for Knowledge-aware Language Model Attribution
by: Li, Xinze, et al.
Published: (2023)
SAR-LM: Symbolic Audio Reasoning with Large Language Models
by: Taheri, Termeh, et al.
Published: (2025)
The in-context inductive biases of vision-language models differ across modalities
by: Allen, Kelsey, et al.
Published: (2025)