:: Library Catalog

Bibliographic Details
Main Authors:	Halterman, Andrew; Keith, Katherine A.
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2407.10747

Similar Items

What is a protest anyway? Codebook conceptualization is still a first-order concern in LLM-era classification
by: Halterman, Andrew, et al.
Published: (2025)

Synthetically generated text for supervised text analysis
by: Halterman, Andrew
Published: (2023)

CURP: Codebook-based Continuous User Representation for Personalized Generation with LLMs
by: Wang, Liang, et al.
Published: (2026)

Measuring Scalar Constructs in Social Science with LLMs
by: Licht, Hauke, et al.
Published: (2025)

Benchmarking LLMs for Political Science: A United Nations Perspective
by: Liang, Yueqing, et al.
Published: (2025)

The Political Preferences of LLMs
by: Rozado, David
Published: (2024)

Evaluating LLMs for Targeted Concept Simplification for Domain-Specific Texts
by: Asthana, Sumit, et al.
Published: (2024)

Systematic Evaluation of Long-Context LLMs on Financial Concepts
by: Gupta, Lavanya, et al.
Published: (2024)

Teaching LLMs to Refine with Tools
by: Yu, Dian, et al.
Published: (2024)

Concept Space Alignment in Multilingual LLMs
by: Peng, Qiwei, et al.
Published: (2024)

Concept Attractors in LLMs and their Applications
by: Chytas, Sotirios Panagiotis, et al.
Published: (2025)

Evaluating the Evaluator: Measuring LLMs' Adherence to Task Evaluation Instructions
by: Murugadoss, Bhuvanashree, et al.
Published: (2024)

Evaluating Personalized Tool-Augmented LLMs from the Perspectives of Personalization and Proactivity
by: Hao, Yupu, et al.
Published: (2025)

From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions
by: Rakotonirina, Nathanaël Carraz, et al.
Published: (2025)

Leveraging In-Context Learning for Political Bias Testing of LLMs
by: Haller, Patrick, et al.
Published: (2025)

Incivility and Rigidity: Evaluating the Risks of Fine-Tuning LLMs for Political Argumentation
by: Churina, Svetlana, et al.
Published: (2024)

Bias Beyond Borders: Political Ideology Evaluation and Steering in Multilingual LLMs
by: Nadeem, Afrozah, et al.
Published: (2026)

Can LLMs replace Neil deGrasse Tyson? Evaluating the Reliability of LLMs as Science Communicators
by: Bajpai, Prasoon, et al.
Published: (2024)

LLMs vs. Traditional Sentiment Tools in Psychology: An Evaluation on Belgian-Dutch Narratives
by: Kandala, Ratna, et al.
Published: (2025)

Tool Unlearning for Tool-Augmented LLMs
by: Cheng, Jiali, et al.
Published: (2025)

EasyJudge: an Easy-to-use Tool for Comprehensive Response Evaluation of LLMs
by: Li, Yijie, et al.
Published: (2024)

Measuring Teaching with LLMs
by: Hardy, Michael
Published: (2025)

Funny or Persuasive, but Not Both: Evaluating Fine-Grained Multi-Concept Control in LLMs
by: Labroo, Arya, et al.
Published: (2026)

Hidden Persuaders: LLMs' Political Leaning and Their Influence on Voters
by: Potter, Yujin, et al.
Published: (2024)

ToolGate: Contract-Grounded and Verified Tool Execution for LLMs
by: Liu, Yanming, et al.
Published: (2026)

Using Generative Text Models to Create Qualitative Codebooks for Student Evaluations of Teaching
by: Katz, Andrew, et al.
Published: (2024)

Unsupervised Concept Vector Extraction for Bias Control in LLMs
by: Cyberey, Hannah, et al.
Published: (2025)

Quality Matters: Evaluating Synthetic Data for Tool-Using LLMs
by: Iskander, Shadi, et al.
Published: (2024)

Automate Knowledge Concept Tagging on Math Questions with LLMs
by: Li, Hang, et al.
Published: (2024)

What are They Thinking? Delineation, Probing and Tracking of Concepts in LLMs
by: Abdelwahab, Mohamed, et al.
Published: (2026)

Beyond Tokens: Concept-Level Training Objectives for LLMs
by: Iyer, Laya, et al.
Published: (2026)

Benchmarking Concept-Spilling Across Languages in LLMs
by: Badanin, Ilia, et al.
Published: (2026)

Political Events using RAG with LLMs
by: Arslan, Muhammad, et al.
Published: (2025)

Humanizing LLMs: A Survey of Psychological Measurements with Tools, Datasets, and Human-Agent Applications
by: Dong, Wenhan, et al.
Published: (2025)

Evaluate Bias without Manual Test Sets: A Concept Representation Perspective for LLMs
by: Gao, Lang, et al.
Published: (2025)

ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
by: Feng, Jiazhan, et al.
Published: (2025)

SubData: Bridging Heterogeneous Datasets to Enable Theory-Driven Evaluation of Political and Demographic Perspectives in LLMs
by: Bernardelle, Pietro, et al.
Published: (2024)

DARE-bench: Evaluating Modeling and Instruction Fidelity of LLMs in Data Science
by: Shu, Fan, et al.
Published: (2026)

CUTE: Measuring LLMs' Understanding of Their Tokens
by: Edman, Lukas, et al.
Published: (2024)

Leveraging LLMs for Dialogue Quality Measurement
by: Jia, Jinghan, et al.
Published: (2024)