| Main Authors: | Davies, Adam, Jiang, Jize, Zhai, ChengXiang |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2303.00333 |
Similar Items
TinyHelen's First Curriculum: Training and Evaluating Tiny Language Models in a Simpler Language Environment
by: Yang, Ke, et al.
Published: (2024)
An Investigation of Robustness of LLMs in Mathematical Reasoning: Benchmarking with Mathematically-Equivalent Transformation of Advanced Mathematical Problems
by: Hao, Yuren, et al.
Published: (2025)
Bias and Volatility: A Statistical Framework for Evaluating Large Language Model's Stereotypes and the Associated Generation Inconsistency
by: Liu, Yiran, et al.
Published: (2024)
PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents
by: Yang, Ke, et al.
Published: (2026)
Beyond Reactive Safety: Risk-Aware LLM Alignment via Long-Horizon Simulation
by: Sun, Chenkai, et al.
Published: (2025)
User Preference Modeling for Conversational LLM Agents: Weak Rewards from Retrieval-Augmented Interaction
by: Hao, Yuren, et al.
Published: (2026)
User Simulation in the Era of Generative AI: User Modeling, Synthetic Data Generation, and System Evaluation
by: Balog, Krisztian, et al.
Published: (2025)
User Simulation for Evaluating Information Access Systems
by: Balog, Krisztian, et al.
Published: (2023)
Persona-DB: Efficient Large Language Model Personalization for Response Prediction with Collaborative Data Refinement
by: Sun, Chenkai, et al.
Published: (2024)
Seed-Guided Fine-Grained Entity Typing in Science and Engineering Domains
by: Zhang, Yu, et al.
Published: (2024)
Globally Optimal Training of Spiking Neural Networks via Parameter Reconstruction
by: Udupi, Himanshu, et al.
Published: (2026)
Benchmarking Multi-turn Medical Diagnosis: Hold, Lure, and Self-Correction
by: Fang, Jinrui, et al.
Published: (2026)
Blackbird Language Matrices: A Framework to Investigate the Linguistic Competence of Language Models
by: Merlo, Paola, et al.
Published: (2026)
Ten Principles of AI Agent Economics
by: Yang, Ke, et al.
Published: (2025)
The Indispensable Role of User Simulation in the Pursuit of AGI
by: Balog, Krisztian, et al.
Published: (2025)
An Axiomatic Benchmark for Evaluation of Scientific Novelty Metrics
by: Liu, Miri, et al.
Published: (2026)
Decomposition for Enhancing Attention: Improving LLM-based Text-to-SQL through Workflow Paradigm
by: Xie, Yuanzhen, et al.
Published: (2024)
MentalGLM Series: Explainable Large Language Models for Mental Health Analysis on Chinese Social Media
by: Zhai, Wei, et al.
Published: (2024)
Evaluating the Deductive Competence of Large Language Models
by: Seals, Spencer M., et al.
Published: (2023)
Traces of Social Competence in Large Language Models
by: Kouwenhoven, Tom, et al.
Published: (2026)
Pragmatic Competence Evaluation of Large Language Models for the Korean Language
by: Park, Dojun, et al.
Published: (2024)
Make Any Collection Navigable: Methods for Constructing and Evaluating Hypergraph of Text
by: Alvarez, Dean E., et al.
Published: (2026)
Extrinsic Evaluation of Cultural Competence in Large Language Models
by: Bhatt, Shaily, et al.
Published: (2024)
Benchmarking Motivational Interviewing Competence of Large Language Models
by: Jha, Aishwariya, et al.
Published: (2026)
Evaluation of Cultural Competence of Vision-Language Models
by: Yadav, Srishti, et al.
Published: (2025)
Exploring a New Competency Modeling Process with Large Language Models
by: Du, Silin, et al.
Published: (2026)
Vision-Language Models Are Not Pragmatically Competent in Referring Expression Generation
by: Ma, Ziqiao, et al.
Published: (2025)
Holmes: A Benchmark to Assess the Linguistic Competence of Language Models
by: Waldis, Andreas, et al.
Published: (2024)
Flash Interpretability: Decoding Specialised Feature Neurons in Large Language Models with the LM-Head
by: Davies, Harry J
Published: (2025)
Enhancing Linguistic Competence of Language Models through Pre-training with Language Learning Tasks
by: Yamaguchi, Atsuki, et al.
Published: (2026)
Large Language Models as Neurolinguistic Subjects: Discrepancy between Performance and Competence
by: He, Linyang, et al.
Published: (2024)
Scaling Competence, Shrinking Reasoning: Cognitive Signatures in Language Model Learning
by: Singh, Mukul, et al.
Published: (2025)
Unveiling Language Competence Neurons: A Psycholinguistic Approach to Model Interpretability
by: Duan, Xufeng, et al.
Published: (2024)
SAIL: Sample-Centric In-Context Learning for Document Information Extraction
by: Zhang, Jinyu, et al.
Published: (2024)
A Novel Psychometrics-Based Approach to Developing Professional Competency Benchmark for Large Language Models
by: Kardanova, Elena, et al.
Published: (2024)
Interactive Information Need Prediction with Intent and Context
by: Ros, Kevin, et al.
Published: (2025)
A Lightweight Large Language Model-Based Multi-Agent System for 2D Frame Structural Analysis
by: Geng, Ziheng, et al.
Published: (2025)
Evaluating Clinical Competencies of Large Language Models with a General Practice Benchmark
by: Li, Zheqing, et al.
Published: (2025)
The Pragmatic Mind of Machines: Tracing the Emergence of Pragmatic Competence in Large Language Models
by: Yu, Kefan, et al.
Published: (2025)
Foundation Models for Low-Resource Language Education (Vision Paper)
by: Ding, Zhaojun, et al.
Published: (2024)