| Main Authors: | Halterman, Andrew, Keith, Katherine A. |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2407.10747 |
Similar Items
What is a protest anyway? Codebook conceptualization is still a first-order concern in LLM-era classification
by: Halterman, Andrew, et al.
Published: (2025)
Synthetically generated text for supervised text analysis
by: Halterman, Andrew
Published: (2023)
CURP: Codebook-based Continuous User Representation for Personalized Generation with LLMs
by: Wang, Liang, et al.
Published: (2026)
Measuring Scalar Constructs in Social Science with LLMs
by: Licht, Hauke, et al.
Published: (2025)
Benchmarking LLMs for Political Science: A United Nations Perspective
by: Liang, Yueqing, et al.
Published: (2025)
The Political Preferences of LLMs
by: Rozado, David
Published: (2024)
Evaluating LLMs for Targeted Concept Simplification for Domain-Specific Texts
by: Asthana, Sumit, et al.
Published: (2024)
Systematic Evaluation of Long-Context LLMs on Financial Concepts
by: Gupta, Lavanya, et al.
Published: (2024)
Teaching LLMs to Refine with Tools
by: Yu, Dian, et al.
Published: (2024)
Concept Space Alignment in Multilingual LLMs
by: Peng, Qiwei, et al.
Published: (2024)
Concept Attractors in LLMs and their Applications
by: Chytas, Sotirios Panagiotis, et al.
Published: (2025)
Evaluating the Evaluator: Measuring LLMs' Adherence to Task Evaluation Instructions
by: Murugadoss, Bhuvanashree, et al.
Published: (2024)
Evaluating Personalized Tool-Augmented LLMs from the Perspectives of Personalization and Proactivity
by: Hao, Yupu, et al.
Published: (2025)
From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions
by: Rakotonirina, Nathanaël Carraz, et al.
Published: (2025)
Leveraging In-Context Learning for Political Bias Testing of LLMs
by: Haller, Patrick, et al.
Published: (2025)
Incivility and Rigidity: Evaluating the Risks of Fine-Tuning LLMs for Political Argumentation
by: Churina, Svetlana, et al.
Published: (2024)
Bias Beyond Borders: Political Ideology Evaluation and Steering in Multilingual LLMs
by: Nadeem, Afrozah, et al.
Published: (2026)
Can LLMs replace Neil deGrasse Tyson? Evaluating the Reliability of LLMs as Science Communicators
by: Bajpai, Prasoon, et al.
Published: (2024)
LLMs vs. Traditional Sentiment Tools in Psychology: An Evaluation on Belgian-Dutch Narratives
by: Kandala, Ratna, et al.
Published: (2025)
Tool Unlearning for Tool-Augmented LLMs
by: Cheng, Jiali, et al.
Published: (2025)
EasyJudge: an Easy-to-use Tool for Comprehensive Response Evaluation of LLMs
by: Li, Yijie, et al.
Published: (2024)
Measuring Teaching with LLMs
by: Hardy, Michael
Published: (2025)
Funny or Persuasive, but Not Both: Evaluating Fine-Grained Multi-Concept Control in LLMs
by: Labroo, Arya, et al.
Published: (2026)
Hidden Persuaders: LLMs' Political Leaning and Their Influence on Voters
by: Potter, Yujin, et al.
Published: (2024)
ToolGate: Contract-Grounded and Verified Tool Execution for LLMs
by: Liu, Yanming, et al.
Published: (2026)
Using Generative Text Models to Create Qualitative Codebooks for Student Evaluations of Teaching
by: Katz, Andrew, et al.
Published: (2024)
Unsupervised Concept Vector Extraction for Bias Control in LLMs
by: Cyberey, Hannah, et al.
Published: (2025)
Quality Matters: Evaluating Synthetic Data for Tool-Using LLMs
by: Iskander, Shadi, et al.
Published: (2024)
Automate Knowledge Concept Tagging on Math Questions with LLMs
by: Li, Hang, et al.
Published: (2024)
What are They Thinking? Delineation, Probing and Tracking of Concepts in LLMs
by: Abdelwahab, Mohamed, et al.
Published: (2026)
Beyond Tokens: Concept-Level Training Objectives for LLMs
by: Iyer, Laya, et al.
Published: (2026)
Benchmarking Concept-Spilling Across Languages in LLMs
by: Badanin, Ilia, et al.
Published: (2026)
Political Events using RAG with LLMs
by: Arslan, Muhammad, et al.
Published: (2025)
Humanizing LLMs: A Survey of Psychological Measurements with Tools, Datasets, and Human-Agent Applications
by: Dong, Wenhan, et al.
Published: (2025)
Evaluate Bias without Manual Test Sets: A Concept Representation Perspective for LLMs
by: Gao, Lang, et al.
Published: (2025)
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
by: Feng, Jiazhan, et al.
Published: (2025)
SubData: Bridging Heterogeneous Datasets to Enable Theory-Driven Evaluation of Political and Demographic Perspectives in LLMs
by: Bernardelle, Pietro, et al.
Published: (2024)
DARE-bench: Evaluating Modeling and Instruction Fidelity of LLMs in Data Science
by: Shu, Fan, et al.
Published: (2026)
CUTE: Measuring LLMs' Understanding of Their Tokens
by: Edman, Lukas, et al.
Published: (2024)
Leveraging LLMs for Dialogue Quality Measurement
by: Jia, Jinghan, et al.
Published: (2024)