| Main authors: | Matlin, Glenn, Zhang, Devin, Loza, Rodrigo Barroso, Popescu, Diana M., Isbell, Joni, Chakraborty, Chandreyi, Riedl, Mark |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2508.15794 |
Similar items
Creating Suspenseful Stories: Iterative Planning with Large Language Models
by: Xie, Kaige, et al.
Published: (2024)
On the Quantization Robustness of Diffusion Language Models in Coding Benchmarks
by: Gupta, Aarav, et al.
Published: (2026)
Finance Language Model Evaluation (FLaME)
by: Matlin, Glenn, et al.
Published: (2025)
Agree to Agree
Published: (2020)
An Update to Methods Showcase Articles in Language Learning
by: Daniel R. Isbell
Published: (2026)
Do LLMs Agree on the Creativity Evaluation of Alternative Uses?
by: Rabeyah, Abdullah Al, et al.
Published: (2024)
Financial Instruction Following Evaluation (FIFE)
by: Matlin, Glenn, et al.
Published: (2025)
Agreeing to Interact in Human-Robot Interaction using Large Language Models and Vision Language Models
by: Sasabuchi, Kazuhiro, et al.
Published: (2025)
Do These LLM Benchmarks Agree? Fixing Benchmark Evaluation with BenchBench
by: Perlitz, Yotam, et al.
Published: (2024)
Do Language Models Enjoy Their Own Stories? Prompting Large Language Models for Automatic Story Evaluation
by: Chhun, Cyril, et al.
Published: (2024)
Making Large Language Models into World Models with Precondition and Effect Knowledge
by: Xie, Kaige, et al.
Published: (2024)
Conversational Agents and the Understanding of Human Language: Reflections on AI, LLMs, and Cognitive Science
by: Popescu-Belis, Andrei
Published: (2026)
Shall We Play a Game? Language Models for Open-ended Wargames
by: Matlin, Glenn, et al.
Published: (2025)
Agree, Disagree, Explain: Decomposing Human Label Variation in NLI through the Lens of Explanations
by: Hong, Pingjun, et al.
Published: (2025)
AgreeMate: Teaching LLMs to Haggle
by: Chatterjee, Ainesh, et al.
Published: (2024)
Perceptions of Linguistic Uncertainty by Language Models and Humans
by: Belem, Catarina G, et al.
Published: (2024)
When Annotators Agree but Labels Disagree: The Projection Problem in Stance Detection
by: Zhang, Bowen
Published: (2026)
Agree to Disagree? A Meta-Evaluation of LLM Misgendering
by: Subramonian, Arjun, et al.
Published: (2025)
Demo: TOSense -- What Did You Just Agree to?
by: Chen, Xinzhang, et al.
Published: (2025)
Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models
by: Baheti, Ashutosh, et al.
Published: (2023)
Word Synchronization Challenge: A Benchmark for Word Association Responses for Large Language Models
by: Cazalets, Tanguy, et al.
Published: (2025)
Trust by Design: Skill Profiles for Transparent, Cost-Aware LLM Routing
by: Okamoto, Mika, et al.
Published: (2026)
Evaluating Creative Short Story Generation in Humans and Large Language Models
by: Ismayilzada, Mete, et al.
Published: (2024)
Chain-of-Thought Reasoning Improves Context-Aware Translation with Large Language Models
by: Ataee, Shabnam, et al.
Published: (2025)
Language classrooms as communicative settings for learners’ development of sociolinguistic competence during study abroad
by: Devin Grammon
Published: (2025)
Where Do People Tell Stories Online? Story Detection Across Online Communities
by: Antoniak, Maria, et al.
Published: (2023)
Human Speech Perception in Noise: Can Large Language Models Paraphrase to Improve It?
by: Chingacham, Anupama, et al.
Published: (2024)
A Mixture-of-Experts Approach to Few-Shot Task Transfer in Open-Ended Text Worlds
by: Cui, Christopher Z., et al.
Published: (2024)
Do Language Models Know Theo Has a Wife? Investigating the Proviso Problem
by: Azin, Tara, et al.
Published: (2026)
Revisiting the syntax of imperatives in Yemeni Arabic: An Agree across phases approach
by: Shormani, Mohammed Q.
Published: (2026)
ImmunoFOMO: Are Language Models missing what oncologists see?
by: Sinha, Aman, et al.
Published: (2025)
Do Large Language Models Possess a Theory of Mind? A Comparative Evaluation Using the Strange Stories Paradigm
by: Babarczy, Anna, et al.
Published: (2026)
Do Large Language Models Judge Error Severity Like Humans?
by: Sun, Diege, et al.
Published: (2025)
Do Language Models Exhibit Human-like Structural Priming Effects?
by: Jumelet, Jaap, et al.
Published: (2024)
When Do Language Models Endorse Limitations on Human Rights Principles?
by: Samway, Keenan, et al.
Published: (2026)
We Argue to Agree: Towards Personality-Driven Argumentation-Based Negotiation Dialogue Systems for Tourism
by: Priya, Priyanshu, et al.
Published: (2025)
Do Two AI Scientists Agree?
by: Fu, Xinghong, et al.
Published: (2025)
Reading Subtext: Evaluating Large Language Models on Short Story Summarization with Writers
by: Subbiah, Melanie, et al.
Published: (2024)
Sensación y percepción / Margaret W. Matlin, Hugh J. Foley ; traducción de Marcela Ramírez Escoto
by: Matlin, Margaret W
Multilingual TinyStories: A Synthetic Combinatorial Corpus of Indic Children's Stories for Training Small Language Models
by: Halder, Deepon, et al.
Published: (2026)