Bibliographic Details
Main Authors:	Davies, Adam; Jiang, Jize; Zhai, ChengXiang
Format:	Preprint
Published:	2023
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2303.00333

Similar Items

TinyHelen's First Curriculum: Training and Evaluating Tiny Language Models in a Simpler Language Environment
by: Yang, Ke, et al.
Published: (2024)

An Investigation of Robustness of LLMs in Mathematical Reasoning: Benchmarking with Mathematically-Equivalent Transformation of Advanced Mathematical Problems
by: Hao, Yuren, et al.
Published: (2025)

Bias and Volatility: A Statistical Framework for Evaluating Large Language Model's Stereotypes and the Associated Generation Inconsistency
by: Liu, Yiran, et al.
Published: (2024)

PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents
by: Yang, Ke, et al.
Published: (2026)

Beyond Reactive Safety: Risk-Aware LLM Alignment via Long-Horizon Simulation
by: Sun, Chenkai, et al.
Published: (2025)

User Preference Modeling for Conversational LLM Agents: Weak Rewards from Retrieval-Augmented Interaction
by: Hao, Yuren, et al.
Published: (2026)

User Simulation in the Era of Generative AI: User Modeling, Synthetic Data Generation, and System Evaluation
by: Balog, Krisztian, et al.
Published: (2025)

User Simulation for Evaluating Information Access Systems
by: Balog, Krisztian, et al.
Published: (2023)

Persona-DB: Efficient Large Language Model Personalization for Response Prediction with Collaborative Data Refinement
by: Sun, Chenkai, et al.
Published: (2024)

Seed-Guided Fine-Grained Entity Typing in Science and Engineering Domains
by: Zhang, Yu, et al.
Published: (2024)

Globally Optimal Training of Spiking Neural Networks via Parameter Reconstruction
by: Udupi, Himanshu, et al.
Published: (2026)

Benchmarking Multi-turn Medical Diagnosis: Hold, Lure, and Self-Correction
by: Fang, Jinrui, et al.
Published: (2026)

Blackbird Language Matrices: A Framework to Investigate the Linguistic Competence of Language Models
by: Merlo, Paola, et al.
Published: (2026)

Ten Principles of AI Agent Economics
by: Yang, Ke, et al.
Published: (2025)

The Indispensable Role of User Simulation in the Pursuit of AGI
by: Balog, Krisztian, et al.
Published: (2025)

An Axiomatic Benchmark for Evaluation of Scientific Novelty Metrics
by: Liu, Miri, et al.
Published: (2026)

Decomposition for Enhancing Attention: Improving LLM-based Text-to-SQL through Workflow Paradigm
by: Xie, Yuanzhen, et al.
Published: (2024)

MentalGLM Series: Explainable Large Language Models for Mental Health Analysis on Chinese Social Media
by: Zhai, Wei, et al.
Published: (2024)

Evaluating the Deductive Competence of Large Language Models
by: Seals, Spencer M., et al.
Published: (2023)

Traces of Social Competence in Large Language Models
by: Kouwenhoven, Tom, et al.
Published: (2026)

Pragmatic Competence Evaluation of Large Language Models for the Korean Language
by: Park, Dojun, et al.
Published: (2024)

Make Any Collection Navigable: Methods for Constructing and Evaluating Hypergraph of Text
by: Alvarez, Dean E., et al.
Published: (2026)

Extrinsic Evaluation of Cultural Competence in Large Language Models
by: Bhatt, Shaily, et al.
Published: (2024)

Benchmarking Motivational Interviewing Competence of Large Language Models
by: Jha, Aishwariya, et al.
Published: (2026)

Evaluation of Cultural Competence of Vision-Language Models
by: Yadav, Srishti, et al.
Published: (2025)

Exploring a New Competency Modeling Process with Large Language Models
by: Du, Silin, et al.
Published: (2026)

Vision-Language Models Are Not Pragmatically Competent in Referring Expression Generation
by: Ma, Ziqiao, et al.
Published: (2025)

Holmes: A Benchmark to Assess the Linguistic Competence of Language Models
by: Waldis, Andreas, et al.
Published: (2024)

Flash Interpretability: Decoding Specialised Feature Neurons in Large Language Models with the LM-Head
by: Davies, Harry J
Published: (2025)

Enhancing Linguistic Competence of Language Models through Pre-training with Language Learning Tasks
by: Yamaguchi, Atsuki, et al.
Published: (2026)

Large Language Models as Neurolinguistic Subjects: Discrepancy between Performance and Competence
by: He, Linyang, et al.
Published: (2024)

Scaling Competence, Shrinking Reasoning: Cognitive Signatures in Language Model Learning
by: Singh, Mukul, et al.
Published: (2025)

Unveiling Language Competence Neurons: A Psycholinguistic Approach to Model Interpretability
by: Duan, Xufeng, et al.
Published: (2024)

SAIL: Sample-Centric In-Context Learning for Document Information Extraction
by: Zhang, Jinyu, et al.
Published: (2024)

A Novel Psychometrics-Based Approach to Developing Professional Competency Benchmark for Large Language Models
by: Kardanova, Elena, et al.
Published: (2024)

Interactive Information Need Prediction with Intent and Context
by: Ros, Kevin, et al.
Published: (2025)

A Lightweight Large Language Model-Based Multi-Agent System for 2D Frame Structural Analysis
by: Geng, Ziheng, et al.
Published: (2025)

Evaluating Clinical Competencies of Large Language Models with a General Practice Benchmark
by: Li, Zheqing, et al.
Published: (2025)

The Pragmatic Mind of Machines: Tracing the Emergence of Pragmatic Competence in Large Language Models
by: Yu, Kefan, et al.
Published: (2025)

Foundation Models for Low-Resource Language Education (Vision Paper)
by: Ding, Zhaojun, et al.
Published: (2024)