Library Catalog

Bibliographic Details
Main Authors:	Chang, Ting-Yun; Thomason, Jesse; Jia, Robin
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2406.13131

Similar Items

Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks
by: Chang, Ting-Yun, et al.
Published: (2023)

Language Models can Infer Action Semantics for Symbolic Planners from Environment Feedback
by: Zhu, Wang, et al.
Published: (2024)

PDDL-Mind: Large Language Models are Capable on Belief Reasoning with Reliable State Tracking
by: Zhu, Wang Bill, et al.
Published: (2026)

PSALM-V: Automating Symbolic Planning in Interactive Visual Environments with Large Language Models
by: Zhu, Wang Bill, et al.
Published: (2025)

"The Whole Is Greater Than the Sum of Its Parts": A Compatibility-Aware Multi-Teacher CoT Distillation Framework
by: Cui, Jin, et al.
Published: (2026)

Why Do Some Inputs Break Low-Bit LLM Quantization?
by: Chang, Ting-Yun, et al.
Published: (2025)

Adjust for Trust: Mitigating Trust-Induced Inappropriate Reliance on AI Assistance
by: Srinivasan, Tejas, et al.
Published: (2025)

Efficient End-to-End Visual Document Understanding with Rationale Distillation
by: Zhu, Wang, et al.
Published: (2023)

From Calibration to Collaboration: LLM Uncertainty Quantification Should Be More Human-Centered
by: Devic, Siddartha, et al.
Published: (2025)

Phonological Representation Learning for Isolated Signs Improves Out-of-Vocabulary Generalization
by: Kezar, Lee, et al.
Published: (2025)

Can VLMs Recall Factual Associations From Visual References?
by: Ashok, Dhananjay, et al.
Published: (2025)

Greater Than the Sum of Its Parts
by: Ferguson, Chris, et al.
Published: (2004)

WinoViz: Probing Visual Properties of Objects Under Different States
by: Jin, Woojeong, et al.
Published: (2024)

Words that make SENSE: Sensorimotor Norms in Learned Lexical Token Representations
by: Gupta, Abhinav, et al.
Published: (2026)

Large Language Models Do Multi-Label Classification Differently
by: Ma, Marcus, et al.
Published: (2025)

TwoStep: Multi-agent Task Planning using Classical Planners and Large Language Models
by: Bai, David, et al.
Published: (2024)

Iterative Formalization and Planning in Partially Observable Environments
by: Gong, Liancheng, et al.
Published: (2025)

When Do LLMs Admit Their Mistakes? Understanding The Role Of Model Belief In Retraction
by: Yang, Yuqing, et al.
Published: (2025)

More Than Sum of Its Parts: Deciphering Intent Shifts in Multimodal Hate Speech Detection
by: Sun, Runze, et al.
Published: (2026)

Believing without Seeing: Quality Scores for Contextualizing Vision-Language Model Explanations
by: He, Keyu, et al.
Published: (2025)

Generating Contextually-Relevant Navigation Instructions for Blind and Low Vision People
by: Merchant, Zain, et al.
Published: (2024)

Breaking the Language Barrier: Can Direct Inference Outperform Pre-Translation in Multilingual LLM Applications?
by: Intrator, Yotam, et al.
Published: (2024)

When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration
by: Shi, Quan, et al.
Published: (2025)

Adapted Large Language Models Can Outperform Medical Experts in Clinical Text Summarization
by: Van Veen, Dave, et al.
Published: (2023)

The Sum Leaks More Than Its Parts: Compositional Privacy Risks and Mitigations in Multi-Agent Collaboration
by: Patil, Vaidehi, et al.
Published: (2025)

Task-Specific Efficiency Analysis: When Small Language Models Outperform Large Language Models
by: Cao, Jinghan, et al.
Published: (2026)

The American Sign Language Knowledge Graph: Infusing ASL Models with Linguistic Knowledge
by: Kezar, Lee, et al.
Published: (2024)

Can LLM Teams Play What? Where? When?
by: Kotelnikova, Anastasia, et al.
Published: (2026)

Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models
by: Kunstner, Frederik, et al.
Published: (2024)

Enhancing Reasoning Skills in Small Persian Medical Language Models Can Outperform Large-Scale Data Training
by: Ghassabi, Mehrdad, et al.
Published: (2025)

LiveOIBench: Can Large Language Models Outperform Human Contestants in Informatics Olympiads?
by: Zou, Kaijian, et al.
Published: (2025)

Meaningful Products: Making the Whole Greater Than the Sum of the Parts
by: Jansen, Barbara A.
Published: (2005)

InsertGNN: Can Graph Neural Networks Outperform Humans in TOEFL Sentence Insertion Problem?
by: Wu, Fang, et al.
Published: (2021)

Still Not There: Can LLMs Outperform Smaller Task-Specific Seq2Seq Models on the Poetry-to-Prose Conversion Task?
by: Das, Kunal Kingkar, et al.
Published: (2025)

STALE: Can LLM Agents Know When Their Memories Are No Longer Valid?
by: Chao, Hanxiang, et al.
Published: (2026)

Benchmarks Saturate When The Model Gets Smarter Than The Judge
by: Ballon, Marthe, et al.
Published: (2026)

Can VLM Pseudo-Labels Train a Time-Series QA Model That Outperforms the VLM?
by: Fujimura, Takuya, et al.
Published: (2025)

Can Large Language Models Outperform Non-Experts in Poetry Evaluation? A Comparative Study Using the Consensual Assessment Technique
by: Sawicki, Piotr, et al.
Published: (2025)

Which One? Leveraging Context Between Objects and Multiple Views for Language Grounding
by: Mitra, Chancharik, et al.
Published: (2023)

When Inverse Data Outperforms: Exploring the Pitfalls of Mixed Data in Multi-Stage Fine-Tuning
by: Deng, Mengyi, et al.
Published: (2025)