:: Library Catalog

Bibliographic Details
Main Authors:	Stranges, Nicholas; Yang, Yimin
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2603.04429

Similar Items

The Outputs of Large Language Models are Meaningless
by: Hattiangadi, Anandi, et al.
Published: (2025)

Anthropomimetic Uncertainty: What Verbalized Uncertainty in Language Models is Missing
by: Ulmer, Dennis, et al.
Published: (2025)

Missed Connections: Lateral Thinking Puzzles for Large Language Models
by: Todd, Graham, et al.
Published: (2024)

Output Scouting: Auditing Large Language Models for Catastrophic Responses
by: Bell, Andrew, et al.
Published: (2024)

SLOT: Structuring the Output of Large Language Models
by: Wang, Darren Yow-Bang, et al.
Published: (2025)

Large Visual-Language Models Are Also Good Classifiers: A Study of In-Context Multimodal Fake News Detection
by: Jiang, Ye, et al.
Published: (2024)

What Single-Prompt Accuracy Misses: A Multi-Variant Reliability Audit of Language Models
by: Karmakar, Ranit, et al.
Published: (2026)

The Structured Output Benchmark: A Multi-Source Benchmark for Evaluating Structured Output Quality in Large Language Models
by: Singh, Abhinav Kumar, et al.
Published: (2026)

Instruction Tuning Vs. In-Context Learning: Revisiting Large Language Models in Few-Shot Computational Social Science
by: Wang, Taihang, et al.
Published: (2024)

UloRL: An Ultra-Long Output Reinforcement Learning Approach for Advancing Large Language Models' Reasoning Abilities
by: Du, Dong, et al.
Published: (2025)

Detoxification of Large Language Models through Output-layer Fusion with a Calibration Model
by: Tian, Yuanhe, et al.
Published: (2025)

An Evaluation on Large Language Model Outputs: Discourse and Memorization
by: de Wynter, Adrian, et al.
Published: (2023)

Can AI Master Construction Management (CM)? Benchmarking State-of-the-Art Large Language Models on CM Certification Exams
by: Xiong, Ruoxin, et al.
Published: (2025)

Mechanistic Interpretability of Emotion Inference in Large Language Models
by: Tak, Ala N., et al.
Published: (2025)

Using Large Language Models for the Interpretation of Building Regulations
by: Fuchs, Stefan, et al.
Published: (2024)

What is the best model? Application-driven Evaluation for Large Language Models
by: Lian, Shiguo, et al.
Published: (2024)

Binary Autoencoder for Mechanistic Interpretability of Large Language Models
by: Cho, Hakaze, et al.
Published: (2025)

Brain in a Vat: On Missing Pieces Towards Artificial General Intelligence in Large Language Models
by: Ma, Yuxi, et al.
Published: (2023)

Do Large Language Models Know What They Are Capable Of?
by: Barkan, Casey O., et al.
Published: (2025)

Convergence of Outputs When Two Large Language Models Interact in a Multi-Agentic Setup
by: Maiti, Aniruddha, et al.
Published: (2025)

Mind the Gap: Conformative Decoding to Improve Output Diversity of Instruction-Tuned Large Language Models
by: Peeperkorn, Max, et al.
Published: (2025)

StruEdit: Structured Outputs Enable the Fast and Accurate Knowledge Editing for Large Language Models
by: Bi, Baolong, et al.
Published: (2024)

Comparing Human and Large Language Model Interpretation of Implicit Information
by: De Santis, Antonio, et al.
Published: (2026)

Reasoning on Graphs: Faithful and Interpretable Large Language Model Reasoning
by: Luo, Linhao, et al.
Published: (2023)

Interpretable Differential Diagnosis with Dual-Inference Large Language Models
by: Zhou, Shuang, et al.
Published: (2024)

Group-Aware Reinforcement Learning for Output Diversity in Large Language Models
by: Anschel, Oron, et al.
Published: (2025)

Fine-Grained Interpretation of Political Opinions in Large Language Models
by: Hu, Jingyu, et al.
Published: (2025)

CrashSage: A Large Language Model-Centered Framework for Contextual and Interpretable Traffic Crash Analysis
by: Zhen, Hao, et al.
Published: (2025)

DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation
by: Sun, Jiashuo, et al.
Published: (2025)

Safe Inputs but Unsafe Output: Benchmarking Cross-modality Safety Alignment of Large Vision-Language Model
by: Wang, Siyin, et al.
Published: (2024)

Phase Transitions in the Output Distribution of Large Language Models
by: Arnold, Julian, et al.
Published: (2024)

Team QUST at SemEval-2025 Task 10: Evaluating Large Language Models in Multiclass Multi-label Classification of News Entity Framing
by: Liu, Jiyan, et al.
Published: (2025)

LLMD: A Large Language Model for Interpreting Longitudinal Medical Records
by: Porter, Robert, et al.
Published: (2024)

Rethinking Interpretability in the Era of Large Language Models
by: Singh, Chandan, et al.
Published: (2024)

Large Language Model in Financial Regulatory Interpretation
by: Cao, Zhiyu, et al.
Published: (2024)

"Sorry, I Didn't Catch That": How Speech Models Miss What Matters Most
by: Zhou, Kaitlyn, et al.
Published: (2026)

AdapTime: Enabling Adaptive Temporal Reasoning in Large Language Models
by: Deng, Yimin, et al.
Published: (2026)

Round-Trip Translation Reveals What Frontier Multilingual Benchmarks Miss
by: Skorobogat, Ronald, et al.
Published: (2026)

Regulating Large Language Models: A Roundtable Report
by: Nicholas, Gabriel, et al.
Published: (2024)

What are Models Thinking about? Understanding Large Language Model Hallucinations "Psychology" through Model Inner State Analysis
by: Wang, Peiran, et al.
Published: (2025)