| Main Authors: | He, Jacqueline; Yen, Howard; Li, Margaret; Li, Shuyue Stella; Zeng, Zhiyuan; Shi, Weijia; Tsvetkov, Yulia; Chen, Danqi; Koh, Pang Wei; Zettlemoyer, Luke |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.06589 |
Similar Items
MediQ: Question-Asking LLMs and a Benchmark for Reliable Interactive Clinical Reasoning
by: Li, Shuyue Stella, et al.
Published: (2024)
PrefDisco: Benchmarking Proactive Personalized Reasoning
by: Li, Shuyue Stella, et al.
Published: (2025)
Scaling Retrieval-Based Language Models with a Trillion-Token Datastore
by: Shao, Rulin, et al.
Published: (2024)
InfoGatherer: Principled Information Seeking via Evidence Retrieval and Strategic Questioning
by: Taranukhin, Maksym, et al.
Published: (2026)
Teaching LLMs to Abstain across Languages via Multilingual Feedback
by: Feng, Shangbin, et al.
Published: (2024)
A False Sense of Privacy: Evaluating Textual Data Sanitization Beyond Surface-level Privacy Leakage
by: Xin, Rui, et al.
Published: (2025)
Spurious Rewards: Rethinking Training Signals in RLVR
by: Shao, Rulin, et al.
Published: (2025)
Self-Improving VLM Judges Without Human Annotations
by: Lin, Inna Wanyin, et al.
Published: (2025)
Cold-Start Personalization via Training-Free Priors from Structured World Models
by: Bose, Avinandan, et al.
Published: (2026)
Ads in AI Chatbots? An Analysis of How Large Language Models Navigate Conflicts of Interest
by: Wu, Addison J., et al.
Published: (2026)
Data Swarms: Optimizable Generation of Synthetic Evaluation Data
by: Feng, Shangbin, et al.
Published: (2025)
JPEG-LM: LLMs as Image Generators with Canonical Codec Representations
by: Han, Xiaochuang, et al.
Published: (2024)
Long-Context Language Modeling with Parallel Context Encoding
by: Yen, Howard, et al.
Published: (2024)
Deep Reasoning in General Purpose Agents via Structured Meta-Cognition
by: Light, Dean, et al.
Published: (2026)
Do Membership Inference Attacks Work on Large Language Models?
by: Duan, Michael, et al.
Published: (2024)
Reliable, Adaptable, and Attributable Language Models with Retrieval
by: Asai, Akari, et al.
Published: (2024)
Anchored Decoding: Provably Reducing Copyright Risk for Any Language Model
by: He, Jacqueline, et al.
Published: (2026)
RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments
by: Zeng, Zhiyuan, et al.
Published: (2025)
Detecting Pretraining Data from Large Language Models
by: Shi, Weijia, et al.
Published: (2023)
EvoLM: Self-Evolving Language Models through Co-Evolved Discriminative Rubrics
by: Li, Shuyue Stella, et al.
Published: (2026)
ParaPO: Aligning Language Models to Reduce Verbatim Reproduction of Pre-training Data
by: Chen, Tong, et al.
Published: (2025)
Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models
by: Feng, Shangbin, et al.
Published: (2023)
PrefPalette: Personalized Preference Modeling with Latent Attributes
by: Li, Shuyue Stella, et al.
Published: (2025)
How to Train Long-Context Language Models (Effectively)
by: Gao, Tianyu, et al.
Published: (2024)
Interactive Reasoning: Visualizing and Controlling Chain-of-Thought Reasoning in Large Language Models
by: Pang, Rock Yuren, et al.
Published: (2025)
Agentic Aggregation for Parallel Scaling of Long-Horizon Agentic Tasks
by: Lee, Yoonsang, et al.
Published: (2026)
ValueScope: Unveiling Implicit Norms and Values via Return Potential Model of Social Interactions
by: Park, Chan Young, et al.
Published: (2024)
Heterogeneous Swarms: Jointly Optimizing Model Roles and Weights for Multi-LLM Systems
by: Feng, Shangbin, et al.
Published: (2025)
(Mis)Fitting: A Survey of Scaling Laws
by: Li, Margaret, et al.
Published: (2025)
Fantastic Copyrighted Beasts and How (Not) to Generate Them
by: He, Luxi, et al.
Published: (2024)
Resolving Knowledge Conflicts in Large Language Models
by: Wang, Yike, et al.
Published: (2023)
Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration
by: Feng, Shangbin, et al.
Published: (2024)
Negative Token Merging: Image-based Adversarial Feature Guidance
by: Singh, Jaskirat, et al.
Published: (2024)
Privasis: Synthesizing the Largest "Public" Private Dataset from Scratch
by: Kim, Hyunwoo, et al.
Published: (2026)
HorizonBench: Long-Horizon Personalization with Evolving Preferences
by: Li, Shuyue Stella, et al.
Published: (2026)
DySCO: Dynamic Attention-Scaling Decoding for Long-Context Language Models
by: Ye, Xi, et al.
Published: (2026)
Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking
by: Zhang, Wuwei, et al.
Published: (2025)
LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation
by: Ye, Xi, et al.
Published: (2025)
Slicing and Dicing: Configuring Optimal Mixtures of Experts
by: Li, Margaret, et al.
Published: (2026)
CopyBench: Measuring Literal and Non-Literal Reproduction of Copyright-Protected Text in Language Model Generation
by: Chen, Tong, et al.
Published: (2024)