
Bibliographic Details
Main Authors:	Berger, Uri; Baumel, Tal; Stanovsky, Gabriel
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2406.13274

Similar Items

Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games
by: Eckhaus, Niv, et al.
Published: (2025)

Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy, Trends and Metrics Analysis
by: Berger, Uri, et al.
Published: (2024)

Improving Image Captioning by Mimicking Human Reformulation Feedback at Inference-time
by: Berger, Uri, et al.
Published: (2025)

SAUCE: Synchronous and Asynchronous User-Customizable Environment for Multi-Agent LLM Interaction
by: Neuberger, Shlomo, et al.
Published: (2024)

Planted in Pretraining, Swayed by Finetuning: A Case Study on the Origins of Cognitive Biases in LLMs
by: Itzhak, Itay, et al.
Published: (2025)

The State and Fate of Summarization Datasets: A Survey
by: Dahan, Noam, et al.
Published: (2024)

Comparing Humans and Models on a Similar Scale: Towards Cognitive Gender Bias Evaluation in Coreference Resolution
by: Lior, Gili, et al.
Published: (2023)

Do Zombies Understand? A Choose-Your-Own-Adventure Exploration of Machine Cognition
by: Goldstein, Ariel, et al.
Published: (2024)

Looking Beyond The Top-1: Transformers Determine Top Tokens In Order
by: Lioubashevski, Daria, et al.
Published: (2024)

Multilingual Large Language Models and Curse of Multilinguality
by: Gurgurov, Daniil, et al.
Published: (2024)

Leveraging Collection-Wide Similarities for Unsupervised Document Structure Extraction
by: Lior, Gili, et al.
Published: (2024)

Leveraging Digitized Newspapers to Collect Summarization Data in Low-Resource Languages
by: Dahan, Noam, et al.
Published: (2025)

Beyond Memorization: Distinguishing between Reductive and Epistemic Reasoning in LLMs using Classic Logic Puzzles
by: Gabay, Adi, et al.
Published: (2026)

Controllable Synthetic Clinical Note Generation with Privacy Guarantees
by: Baumel, Tal, et al.
Published: (2024)

Comparing the Framing Effect in Humans and LLMs on Naturally Occurring Texts
by: Lior, Gili, et al.
Published: (2025)

PromptSuite: A Task-Agnostic Framework for Multi-Prompt Generation
by: Habba, Eliya, et al.
Published: (2025)

Applying Intrinsic Debiasing on Downstream Tasks: Challenges and Considerations for Machine Translation
by: Iluz, Bar, et al.
Published: (2024)

PRISM: PRIor from corpus Statistics for topic Modeling
by: Ishon, Tal, et al.
Published: (2026)

Cross-Lingual and Cross-Cultural Variation in Image Descriptions
by: Berger, Uri, et al.
Published: (2024)

ReliableEval: A Recipe for Stochastic LLM Evaluation via Method of Moments
by: Lior, Gili, et al.
Published: (2025)

Can LLMs Help Uncover Insights about LLMs? A Large-Scale, Evolving Literature Analysis of Frontier LLMs
by: Park, Jungsoo, et al.
Published: (2025)

More Documents, Same Length: Isolating the Challenge of Multiple Documents in RAG
by: Levy, Shahar, et al.
Published: (2025)

Anticipatory Evaluation of Language Models
by: Park, Jungsoo, et al.
Published: (2025)

Beyond Benchmarks: On The False Promise of AI Regulation
by: Stanovsky, Gabriel, et al.
Published: (2025)

From Feelings to Metrics: Understanding and Formalizing How Users Vibe-Test LLMs
by: Itzhak, Itay, et al.
Published: (2026)

SEAM: A Stochastic Benchmark for Multi-Document Tasks
by: Lior, Gili, et al.
Published: (2024)

State of What Art? A Call for Multi-Prompt LLM Evaluation
by: Mizrahi, Moran, et al.
Published: (2023)

In-Context Learning with Long-Context Models: An In-Depth Exploration
by: Bertsch, Amanda, et al.
Published: (2024)

A Nurse is Blue and Elephant is Rugby: Cross Domain Alignment in Large Language Models Reveal Human-like Patterns
by: Yehudai, Asaf, et al.
Published: (2024)

Learning Metadata-Agnostic Representations for Text-to-SQL In-Context Example Selection
by: Mai, Chuhong, et al.
Published: (2024)

ScheMatiQ: From Research Question to Structured Data through Interactive Schema Discovery
by: Levy, Shahar, et al.
Published: (2026)

Schema-Driven Information Extraction from Heterogeneous Tables
by: Bai, Fan, et al.
Published: (2023)

Trust Me, I'm Wrong: LLMs Hallucinate with Certainty Despite Knowing the Answer
by: Simhi, Adi, et al.
Published: (2025)

Emotion Classification In-Context in Spanish
by: Thapa, Bipul, et al.
Published: (2025)

Cooking Up Creativity: Enhancing LLM Creativity through Structured Recombination
by: Mizrahi, Moran, et al.
Published: (2025)

Token-Budget-Aware LLM Reasoning
by: Han, Tingxu, et al.
Published: (2024)

Evaluating In-Context Translation with Synchronous Context-Free Grammar Transduction
by: Petty, Jackson, et al.
Published: (2026)

In-context Learning Generalizes, But Not Always Robustly: The Case of Syntax
by: Mueller, Aaron, et al.
Published: (2023)

Language Models Struggle to Use Representations Learned In-Context
by: Lepori, Michael A., et al.
Published: (2026)

K-QA: A Real-World Medical Q&A Benchmark
by: Manes, Itay, et al.
Published: (2024)