:: Library Catalog

Bibliographic Details
Main Authors:	Hu, Jinpeng; Dong, Tengteng; Gang, Luo; Ma, Hui; Zou, Peng; Sun, Xiao; Guo, Dan; Yang, Xun; Wang, Meng
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2407.05721

Similar Items

Psyche-R1: Towards Reliable Psychological LLMs through Unified Empathy, Expertise, and Reasoning
by: Dai, Chongyuan, et al.
Published: (2025)

Traits Run Deep: Enhancing Personality Assessment via Psychology-Guided LLM Representations and Multimodal Apparent Behaviors
by: Li, Jia, et al.
Published: (2025)

AgentMental: An Interactive Multi-Agent Framework for Explainable and Adaptive Mental Health Assessment
by: Hu, Jinpeng, et al.
Published: (2025)

Think-Augmented Function Calling: Improving LLM Parameter Accuracy Through Embedded Reasoning
by: Wei, Lei, et al.
Published: (2026)

In-Context Examples Matter: Improving Emotion Recognition in Conversation with Instruction Tuning
by: Ma, Hui, et al.
Published: (2025)

Unveiling LLM Evaluation Focused on Metrics: Challenges and Solutions
by: Hu, Taojun, et al.
Published: (2024)

Understanding Layer Significance in LLM Alignment
by: Shi, Guangyuan, et al.
Published: (2024)

TypyBench: Evaluating LLM Type Inference for Untyped Python Repositories
by: Dong, Honghua, et al.
Published: (2025)

DOCBENCH: A Benchmark for Evaluating LLM-based Document Reading Systems
by: Zou, Anni, et al.
Published: (2024)

Ψ-Arena: Interactive Assessment and Optimization of LLM-based Psychological Counselors with Tripartite Feedback
by: Zhu, Shijing, et al.
Published: (2025)

CSCE: Boosting LLM Reasoning by Simultaneous Enhancing of Causal Significance and Consistency
by: Wang, Kangsheng, et al.
Published: (2024)

Lost in the Mix: Evaluating LLM Understanding of Code-Switched Text
by: Mohamed, Amr, et al.
Published: (2025)

LLM Hallucination Detection: HSAD
by: Li, JinXin, et al.
Published: (2025)

ScreenLLM: Stateful Screen Schema for Efficient Action Understanding and Prediction
by: Jin, Yiqiao, et al.
Published: (2025)

LLM-based NLG Evaluation: Current Status and Challenges
by: Gao, Mingqi, et al.
Published: (2024)

Enhancing LLM Reasoning with Multi-Path Collaborative Reactive and Reflection agents
by: He, Chengbo, et al.
Published: (2024)

MedDialBench: Benchmarking LLM Diagnostic Robustness under Parametric Adversarial Patient Behaviors
by: Luo, Xiaotian, et al.
Published: (2026)

Explaining Length Bias in LLM-Based Preference Evaluations
by: Hu, Zhengyu, et al.
Published: (2024)

LLM-Guided Strategy Synthesis for Scalable Equality Saturation
by: Yin, Chenyun, et al.
Published: (2026)

Evaluating Human Alignment and Model Faithfulness of LLM Rationale
by: Fayyaz, Mohsen, et al.
Published: (2024)

Code-Switching Red-Teaming: LLM Evaluation for Safety and Multilingual Understanding
by: Yoo, Haneul, et al.
Published: (2024)

Code Fingerprints: Disentangled Attribution of LLM-Generated Code
by: Guo, Jiaxun, et al.
Published: (2026)

HEART-Bench: Do LLM Agents Exhibit Human-like Psychology?
by: Peng, Weihan, et al.
Published: (2026)

Benchmarking LLM Guardrails in Handling Multilingual Toxicity
by: Yang, Yahan, et al.
Published: (2024)

WEST: LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction
by: Zhang, Binbin, et al.
Published: (2025)

Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
by: Cao, Yixin, et al.
Published: (2025)

Are LLM-based Evaluators Confusing NLG Quality Criteria?
by: Hu, Xinyu, et al.
Published: (2024)

Citation-Enhanced Generation for LLM-based Chatbots
by: Li, Weitao, et al.
Published: (2024)

Skill-Conditioned Gated Self-Distillation for LLM Reasoning
by: Huang, Jiazhen, et al.
Published: (2026)

LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning
by: Meng, Silin, et al.
Published: (2024)

LLM-MedQA: Enhancing Medical Question Answering through Case Studies in Large Language Models
by: Yang, Hang, et al.
Published: (2024)

DuanzAI: Slang-Enhanced LLM with Prompt for Humor Understanding
by: Rohn, Yesian
Published: (2024)

BoRP: Bootstrapped Regression Probing for Scalable and Human-Aligned LLM Evaluation
by: Sun, Peng, et al.
Published: (2026)

Understanding LLM Embeddings for Regression
by: Tang, Eric, et al.
Published: (2024)

IDGen: Item Discrimination Induced Prompt Generation for LLM Evaluation
by: Lin, Fan, et al.
Published: (2024)

Exploring LLM Multi-Agents for ICD Coding
by: Li, Rumeng, et al.
Published: (2024)

PyramidInfer: Pyramid KV Cache Compression for High-throughput LLM Inference
by: Yang, Dongjie, et al.
Published: (2024)

MIRAI: Evaluating LLM Agents for Event Forecasting
by: Ye, Chenchen, et al.
Published: (2024)

HuggingGraph: Understanding the Supply Chain of LLM Ecosystem
by: Rahman, Mohammad Shahedur, et al.
Published: (2025)

HTAA: Enhancing LLM Planning via Hybrid Toolset Agentization & Adaptation
by: Huang, Chengrui, et al.
Published: (2026)