:: Library Catalog

Bibliographic Details
Main Authors:	Ivey, Jonathan; Field, Anjalie; Xiao, Ziang
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2604.05163

Similar Items

Controlled Generation for Private Synthetic Text
by: Zhao, Zihao, et al.
Published: (2025)

HICode: Hierarchical Inductive Coding with LLMs
by: Zhong, Mian, et al.
Published: (2025)

Generative Personality Simulation via Theory-Informed Structured Interview
by: Wang, Pengda, et al.
Published: (2025)

What Makes Good Multilingual Reasoning? Disentangling Reasoning Traces with Measurable Features
by: Ki, Dayeon, et al.
Published: (2026)

What Makes a Reward Model a Good Teacher? An Optimization Perspective
by: Razin, Noam, et al.
Published: (2025)

What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance
by: Watson, William, et al.
Published: (2026)

The Effect of Document Selection on Query-focused Text Analysis
by: Rangreji, Sandesh S, et al.
Published: (2026)

Gender Bias in LLM-generated Interview Responses
by: Kong, Haein, et al.
Published: (2024)

What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
by: Liu, Wei, et al.
Published: (2023)

What Makes a Good Speech Tokenizer for LLM-Centric Speech Generation? A Systematic Study
by: Fan, Xiaoran, et al.
Published: (2025)

Does Local News Stay Local?: Online Content Shifts in Sinclair-Acquired Stations
by: Wanner, Miriam, et al.
Published: (2025)

NewsInterview: a Dataset and a Playground to Evaluate LLMs' Ground Gap via Informational Interviews
by: Spangher, Alexander, et al.
Published: (2024)

What Makes In-context Learning Effective for Mathematical Reasoning: A Theoretical Analysis
by: Liu, Jiayu, et al.
Published: (2024)

Framework-Based Qualitative Analysis of Free Responses of Large Language Models: Algorithmic Fidelity
by: Amirova, Aliya, et al.
Published: (2023)

Is my Meeting Summary Good? Estimating Quality with a Multi-LLM Evaluator
by: Kirstein, Frederic, et al.
Published: (2024)

Word Clouds as Common Voices: LLM-Assisted Visualization of Participant-Weighted Themes in Qualitative Interviews
by: Colonel, Joseph T., et al.
Published: (2025)

What Makes Cryptic Crosswords Challenging for LLMs?
by: Sadallah, Abdelrahman, et al.
Published: (2024)

What Features in Prompts Jailbreak LLMs? Investigating the Mechanisms Behind Attacks
by: Kirch, Nathalie, et al.
Published: (2024)

Faux Polyglot: A Study on Information Disparity in Multilingual Large Language Models
by: Sharma, Nikhil, et al.
Published: (2024)

Empirical Analysis of Decoding Biases in Masked Diffusion Models
by: Huang, Pengcheng, et al.
Published: (2025)

When Good and Reproducible Results are a Giant with Feet of Clay: The Importance of Software Quality in NLP
by: Papi, Sara, et al.
Published: (2023)

Context Over Compute: Human-in-the-Loop Outperforms Iterative Chain-of-Thought Prompting in Interview Answer Quality
by: Zhu, Kewen, et al.
Published: (2026)

Generative Echo Chamber? Effects of LLM-Powered Search Systems on Diverse Information Seeking
by: Sharma, Nikhil, et al.
Published: (2024)

What Defines Good Reasoning in LLMs? Dissecting Reasoning Steps with Multi-Aspect Evaluation
by: Do, Heejin, et al.
Published: (2025)

What Makes an Ideal Quote? Recommending "Unexpected yet Rational" Quotations via Novelty
by: Zhang, Bowei, et al.
Published: (2025)

GoodPoint: Learning Constructive Scientific Paper Feedback from Author Responses
by: Mun, Jimin, et al.
Published: (2026)

Learning to Self-Verify Makes Language Models Better Reasoners
by: Chen, Yuxin, et al.
Published: (2026)

Rethinking the Alignment of Psychotherapy Dialogue Generation with Motivational Interviewing Strategies
by: Sun, Xin, et al.
Published: (2024)

InterviewSim: A Scalable Framework for Interview-Grounded Personality Simulation
by: Li, Yu, et al.
Published: (2026)

Poser: Unmasking Alignment Faking LLMs by Manipulating Their Internals
by: Clymer, Joshua, et al.
Published: (2024)

What Level of Automation is "Good Enough"? A Benchmark of Large Language Models for Meta-Analysis Data Extraction
by: Li, Lingbo, et al.
Published: (2025)

An Empirical Analysis of Diversity in Argument Summarization
by: van der Meer, Michiel, et al.
Published: (2024)

The Incomplete Bridge: How AI Research (Mis)Engages with Psychology
by: Jiang, Han, et al.
Published: (2025)

AgentCollabBench: Diagnosing When Good Agents Make Bad Collaborators
by: Mazumder, Aritra, et al.
Published: (2026)

AI Conversational Interviewing: Transforming Surveys with LLMs as Adaptive Interviewers
by: Wuttke, Alexander, et al.
Published: (2024)

What Affects the Stability of Tool Learning? An Empirical Study on the Robustness of Tool Learning Frameworks
by: Huang, Chengrui, et al.
Published: (2024)

TALES: Text Adventure Learning Environment Suite
by: Cui, Christopher Zhang, et al.
Published: (2025)

Automated Evaluation can Distinguish the Good and Bad AI Responses to Patient Questions about Hospitalization
by: Soni, Sarvesh, et al.
Published: (2025)

Unrequited Emotions: Investigating the Gaps in Motivation and Practice in Speech Emotion Recognition Research
by: Wong, Taryn, et al.
Published: (2026)

How do we measure privacy in text? A survey of text anonymization metrics
by: Ren, Yaxuan, et al.
Published: (2025)