| Main Authors: | Ivey, Jonathan, Field, Anjalie, Xiao, Ziang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.05163 |
Similar Items
Controlled Generation for Private Synthetic Text
by: Zhao, Zihao, et al.
Published: (2025)
HICode: Hierarchical Inductive Coding with LLMs
by: Zhong, Mian, et al.
Published: (2025)
Generative Personality Simulation via Theory-Informed Structured Interview
by: Wang, Pengda, et al.
Published: (2025)
What Makes Good Multilingual Reasoning? Disentangling Reasoning Traces with Measurable Features
by: Ki, Dayeon, et al.
Published: (2026)
What Makes a Reward Model a Good Teacher? An Optimization Perspective
by: Razin, Noam, et al.
Published: (2025)
What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance
by: Watson, William, et al.
Published: (2026)
The Effect of Document Selection on Query-focused Text Analysis
by: Rangreji, Sandesh S, et al.
Published: (2026)
Gender Bias in LLM-generated Interview Responses
by: Kong, Haein, et al.
Published: (2024)
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
by: Liu, Wei, et al.
Published: (2023)
What Makes a Good Speech Tokenizer for LLM-Centric Speech Generation? A Systematic Study
by: Fan, Xiaoran, et al.
Published: (2025)
Does Local News Stay Local?: Online Content Shifts in Sinclair-Acquired Stations
by: Wanner, Miriam, et al.
Published: (2025)
NewsInterview: a Dataset and a Playground to Evaluate LLMs' Ground Gap via Informational Interviews
by: Spangher, Alexander, et al.
Published: (2024)
What Makes In-context Learning Effective for Mathematical Reasoning: A Theoretical Analysis
by: Liu, Jiayu, et al.
Published: (2024)
Framework-Based Qualitative Analysis of Free Responses of Large Language Models: Algorithmic Fidelity
by: Amirova, Aliya, et al.
Published: (2023)
Is my Meeting Summary Good? Estimating Quality with a Multi-LLM Evaluator
by: Kirstein, Frederic, et al.
Published: (2024)
Word Clouds as Common Voices: LLM-Assisted Visualization of Participant-Weighted Themes in Qualitative Interviews
by: Colonel, Joseph T., et al.
Published: (2025)
What Makes Cryptic Crosswords Challenging for LLMs?
by: Sadallah, Abdelrahman, et al.
Published: (2024)
What Features in Prompts Jailbreak LLMs? Investigating the Mechanisms Behind Attacks
by: Kirch, Nathalie, et al.
Published: (2024)
Faux Polyglot: A Study on Information Disparity in Multilingual Large Language Models
by: Sharma, Nikhil, et al.
Published: (2024)
Empirical Analysis of Decoding Biases in Masked Diffusion Models
by: Huang, Pengcheng, et al.
Published: (2025)
When Good and Reproducible Results are a Giant with Feet of Clay: The Importance of Software Quality in NLP
by: Papi, Sara, et al.
Published: (2023)
Context Over Compute Human-in-the-Loop Outperforms Iterative Chain-of-Thought Prompting in Interview Answer Quality
by: Zhu, Kewen, et al.
Published: (2026)
Generative Echo Chamber? Effects of LLM-Powered Search Systems on Diverse Information Seeking
by: Sharma, Nikhil, et al.
Published: (2024)
What Defines Good Reasoning in LLMs? Dissecting Reasoning Steps with Multi-Aspect Evaluation
by: Do, Heejin, et al.
Published: (2025)
What Makes an Ideal Quote? Recommending "Unexpected yet Rational" Quotations via Novelty
by: Zhang, Bowei, et al.
Published: (2025)
GoodPoint: Learning Constructive Scientific Paper Feedback from Author Responses
by: Mun, Jimin, et al.
Published: (2026)
Learning to Self-Verify Makes Language Models Better Reasoners
by: Chen, Yuxin, et al.
Published: (2026)
Rethinking the Alignment of Psychotherapy Dialogue Generation with Motivational Interviewing Strategies
by: Sun, Xin, et al.
Published: (2024)
InterviewSim: A Scalable Framework for Interview-Grounded Personality Simulation
by: Li, Yu, et al.
Published: (2026)
Poser: Unmasking Alignment Faking LLMs by Manipulating Their Internals
by: Clymer, Joshua, et al.
Published: (2024)
What Level of Automation is "Good Enough"? A Benchmark of Large Language Models for Meta-Analysis Data Extraction
by: Li, Lingbo, et al.
Published: (2025)
An Empirical Analysis of Diversity in Argument Summarization
by: van der Meer, Michiel, et al.
Published: (2024)
The Incomplete Bridge: How AI Research (Mis)Engages with Psychology
by: Jiang, Han, et al.
Published: (2025)
AgentCollabBench: Diagnosing When Good Agents Make Bad Collaborators
by: Mazumder, Aritra, et al.
Published: (2026)
AI Conversational Interviewing: Transforming Surveys with LLMs as Adaptive Interviewers
by: Wuttke, Alexander, et al.
Published: (2024)
What Affects the Stability of Tool Learning? An Empirical Study on the Robustness of Tool Learning Frameworks
by: Huang, Chengrui, et al.
Published: (2024)
TALES: Text Adventure Learning Environment Suite
by: Cui, Christopher Zhang, et al.
Published: (2025)
Automated Evaluation can Distinguish the Good and Bad AI Responses to Patient Questions about Hospitalization
by: Soni, Sarvesh, et al.
Published: (2025)
Unrequited Emotions: Investigating the Gaps in Motivation and Practice in Speech Emotion Recognition Research
by: Wong, Taryn, et al.
Published: (2026)
How do we measure privacy in text? A survey of text anonymization metrics
by: Ren, Yaxuan, et al.
Published: (2025)