:: Library Catalog

Bibliographic Details
Main Authors:	Matlin, Glenn; Zhang, Devin; Loza, Rodrigo Barroso; Popescu, Diana M.; Isbell, Joni; Chakraborty, Chandreyi; Riedl, Mark
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2508.15794

Similar Items

Creating Suspenseful Stories: Iterative Planning with Large Language Models
by: Xie, Kaige, et al.
Published: (2024)

On the Quantization Robustness of Diffusion Language Models in Coding Benchmarks
by: Gupta, Aarav, et al.
Published: (2026)

Finance Language Model Evaluation (FLaME)
by: Matlin, Glenn, et al.
Published: (2025)

Agree to Agree
Published: (2020)

An Update to Methods Showcase Articles in Language Learning
by: Daniel R. Isbell
Published: (2026)

Do LLMs Agree on the Creativity Evaluation of Alternative Uses?
by: Rabeyah, Abdullah Al, et al.
Published: (2024)

Financial Instruction Following Evaluation (FIFE)
by: Matlin, Glenn, et al.
Published: (2025)

Agreeing to Interact in Human-Robot Interaction using Large Language Models and Vision Language Models
by: Sasabuchi, Kazuhiro, et al.
Published: (2025)

Do These LLM Benchmarks Agree? Fixing Benchmark Evaluation with BenchBench
by: Perlitz, Yotam, et al.
Published: (2024)

Do Language Models Enjoy Their Own Stories? Prompting Large Language Models for Automatic Story Evaluation
by: Chhun, Cyril, et al.
Published: (2024)

Making Large Language Models into World Models with Precondition and Effect Knowledge
by: Xie, Kaige, et al.
Published: (2024)

Conversational Agents and the Understanding of Human Language: Reflections on AI, LLMs, and Cognitive Science
by: Popescu-Belis, Andrei
Published: (2026)

Shall We Play a Game? Language Models for Open-ended Wargames
by: Matlin, Glenn, et al.
Published: (2025)

Agree, Disagree, Explain: Decomposing Human Label Variation in NLI through the Lens of Explanations
by: Hong, Pingjun, et al.
Published: (2025)

AgreeMate: Teaching LLMs to Haggle
by: Chatterjee, Ainesh, et al.
Published: (2024)

Perceptions of Linguistic Uncertainty by Language Models and Humans
by: Belem, Catarina G, et al.
Published: (2024)

When Annotators Agree but Labels Disagree: The Projection Problem in Stance Detection
by: Zhang, Bowen
Published: (2026)

Agree to Disagree? A Meta-Evaluation of LLM Misgendering
by: Subramonian, Arjun, et al.
Published: (2025)

Demo: TOSense -- What Did You Just Agree to?
by: Chen, Xinzhang, et al.
Published: (2025)

Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models
by: Baheti, Ashutosh, et al.
Published: (2023)

Word Synchronization Challenge: A Benchmark for Word Association Responses for Large Language Models
by: Cazalets, Tanguy, et al.
Published: (2025)

Trust by Design: Skill Profiles for Transparent, Cost-Aware LLM Routing
by: Okamoto, Mika, et al.
Published: (2026)

Evaluating Creative Short Story Generation in Humans and Large Language Models
by: Ismayilzada, Mete, et al.
Published: (2024)

Chain-of-Thought Reasoning Improves Context-Aware Translation with Large Language Models
by: Ataee, Shabnam, et al.
Published: (2025)

Language classrooms as communicative settings for learners’ development of sociolinguistic competence during study abroad
by: Devin Grammon
Published: (2025)

Where Do People Tell Stories Online? Story Detection Across Online Communities
by: Antoniak, Maria, et al.
Published: (2023)

Human Speech Perception in Noise: Can Large Language Models Paraphrase to Improve It?
by: Chingacham, Anupama, et al.
Published: (2024)

A Mixture-of-Experts Approach to Few-Shot Task Transfer in Open-Ended Text Worlds
by: Cui, Christopher Z., et al.
Published: (2024)

Do Language Models Know Theo Has a Wife? Investigating the Proviso Problem
by: Azin, Tara, et al.
Published: (2026)

Revisiting the syntax of imperatives in Yemeni Arabic: An Agree across phases approach
by: Shormani, Mohammed Q.
Published: (2026)

ImmunoFOMO: Are Language Models missing what oncologists see?
by: Sinha, Aman, et al.
Published: (2025)

Do Large Language Models Possess a Theory of Mind? A Comparative Evaluation Using the Strange Stories Paradigm
by: Babarczy, Anna, et al.
Published: (2026)

Do Large Language Models Judge Error Severity Like Humans?
by: Sun, Diege, et al.
Published: (2025)

Do Language Models Exhibit Human-like Structural Priming Effects?
by: Jumelet, Jaap, et al.
Published: (2024)

When Do Language Models Endorse Limitations on Human Rights Principles?
by: Samway, Keenan, et al.
Published: (2026)

We Argue to Agree: Towards Personality-Driven Argumentation-Based Negotiation Dialogue Systems for Tourism
by: Priya, Priyanshu, et al.
Published: (2025)

Do Two AI Scientists Agree?
by: Fu, Xinghong, et al.
Published: (2025)

Reading Subtext: Evaluating Large Language Models on Short Story Summarization with Writers
by: Subbiah, Melanie, et al.
Published: (2024)

Sensación y percepción / Margaret W. Matlin, Hugh J. Foley; translated by Marcela Ramírez Escoto
by: Matlin, Margaret W

Multilingual TinyStories: A Synthetic Combinatorial Corpus of Indic Children's Stories for Training Small Language Models
by: Halder, Deepon, et al.
Published: (2026)