| Main Authors: | Chang, Ting-Yun; Thomason, Jesse; Jia, Robin |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.13131 |
Similar Items
Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks
by: Chang, Ting-Yun, et al.
Published: (2023)
Language Models can Infer Action Semantics for Symbolic Planners from Environment Feedback
by: Zhu, Wang, et al.
Published: (2024)
PDDL-Mind: Large Language Models are Capable on Belief Reasoning with Reliable State Tracking
by: Zhu, Wang Bill, et al.
Published: (2026)
PSALM-V: Automating Symbolic Planning in Interactive Visual Environments with Large Language Models
by: Zhu, Wang Bill, et al.
Published: (2025)
"The Whole Is Greater Than the Sum of Its Parts": A Compatibility-Aware Multi-Teacher CoT Distillation Framework
by: Cui, Jin, et al.
Published: (2026)
Why Do Some Inputs Break Low-Bit LLM Quantization?
by: Chang, Ting-Yun, et al.
Published: (2025)
Adjust for Trust: Mitigating Trust-Induced Inappropriate Reliance on AI Assistance
by: Srinivasan, Tejas, et al.
Published: (2025)
Efficient End-to-End Visual Document Understanding with Rationale Distillation
by: Zhu, Wang, et al.
Published: (2023)
From Calibration to Collaboration: LLM Uncertainty Quantification Should Be More Human-Centered
by: Devic, Siddartha, et al.
Published: (2025)
Phonological Representation Learning for Isolated Signs Improves Out-of-Vocabulary Generalization
by: Kezar, Lee, et al.
Published: (2025)
Can VLMs Recall Factual Associations From Visual References?
by: Ashok, Dhananjay, et al.
Published: (2025)
Greater Than the Sum of Its Parts
by: Ferguson, Chris, et al.
Published: (2004)
WinoViz: Probing Visual Properties of Objects Under Different States
by: Jin, Woojeong, et al.
Published: (2024)
Words that make SENSE: Sensorimotor Norms in Learned Lexical Token Representations
by: Gupta, Abhinav, et al.
Published: (2026)
Large Language Models Do Multi-Label Classification Differently
by: Ma, Marcus, et al.
Published: (2025)
TwoStep: Multi-agent Task Planning using Classical Planners and Large Language Models
by: Bai, David, et al.
Published: (2024)
Iterative Formalization and Planning in Partially Observable Environments
by: Gong, Liancheng, et al.
Published: (2025)
When Do LLMs Admit Their Mistakes? Understanding The Role Of Model Belief In Retraction
by: Yang, Yuqing, et al.
Published: (2025)
More Than Sum of Its Parts: Deciphering Intent Shifts in Multimodal Hate Speech Detection
by: Sun, Runze, et al.
Published: (2026)
Believing without Seeing: Quality Scores for Contextualizing Vision-Language Model Explanations
by: He, Keyu, et al.
Published: (2025)
Generating Contextually-Relevant Navigation Instructions for Blind and Low Vision People
by: Merchant, Zain, et al.
Published: (2024)
Breaking the Language Barrier: Can Direct Inference Outperform Pre-Translation in Multilingual LLM Applications?
by: Intrator, Yotam, et al.
Published: (2024)
When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration
by: Shi, Quan, et al.
Published: (2025)
Adapted Large Language Models Can Outperform Medical Experts in Clinical Text Summarization
by: Van Veen, Dave, et al.
Published: (2023)
The Sum Leaks More Than Its Parts: Compositional Privacy Risks and Mitigations in Multi-Agent Collaboration
by: Patil, Vaidehi, et al.
Published: (2025)
Task-Specific Efficiency Analysis: When Small Language Models Outperform Large Language Models
by: Cao, Jinghan, et al.
Published: (2026)
The American Sign Language Knowledge Graph: Infusing ASL Models with Linguistic Knowledge
by: Kezar, Lee, et al.
Published: (2024)
Can LLM Teams Play What? Where? When?
by: Kotelnikova, Anastasia, et al.
Published: (2026)
Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models
by: Kunstner, Frederik, et al.
Published: (2024)
Enhancing Reasoning Skills in Small Persian Medical Language Models Can Outperform Large-Scale Data Training
by: Ghassabi, Mehrdad, et al.
Published: (2025)
LiveOIBench: Can Large Language Models Outperform Human Contestants in Informatics Olympiads?
by: Zou, Kaijian, et al.
Published: (2025)
Meaningful Products: Making the Whole Greater Than the Sum of the Parts
by: Jansen, Barbara A.
Published: (2005)
InsertGNN: Can Graph Neural Networks Outperform Humans in TOEFL Sentence Insertion Problem?
by: Wu, Fang, et al.
Published: (2021)
Still Not There: Can LLMs Outperform Smaller Task-Specific Seq2Seq Models on the Poetry-to-Prose Conversion Task?
by: Das, Kunal Kingkar, et al.
Published: (2025)
STALE: Can LLM Agents Know When Their Memories Are No Longer Valid?
by: Chao, Hanxiang, et al.
Published: (2026)
Benchmarks Saturate When The Model Gets Smarter Than The Judge
by: Ballon, Marthe, et al.
Published: (2026)
Can VLM Pseudo-Labels Train a Time-Series QA Model That Outperforms the VLM?
by: Fujimura, Takuya, et al.
Published: (2025)
Can Large Language Models Outperform Non-Experts in Poetry Evaluation? A Comparative Study Using the Consensual Assessment Technique
by: Sawicki, Piotr, et al.
Published: (2025)
Which One? Leveraging Context Between Objects and Multiple Views for Language Grounding
by: Mitra, Chancharik, et al.
Published: (2023)
When Inverse Data Outperforms: Exploring the Pitfalls of Mixed Data in Multi-Stage Fine-Tuning
by: Deng, Mengyi, et al.
Published: (2025)