| Main Authors: | Fan, Zhihao, Tang, Jialong, Chen, Wei, Wang, Siyuan, Wei, Zhongyu, Xi, Jun, Huang, Fei, Zhou, Jingren |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.09742 |
Similar Items
Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation
by: Wang, Siyuan, et al.
Published: (2024)
Multi-Agent Simulator Drives Language Models for Legal Intensive Interaction
by: Yue, Shengbin, et al.
Published: (2025)
Multi-agent KTO: Reinforcing Strategic Interactions of Large Language Model in Language Game
by: Ye, Rong, et al.
Published: (2025)
From LLMs to MLLMs: Exploring the Landscape of Multimodal Jailbreaking
by: Wang, Siyuan, et al.
Published: (2024)
PIORS: Personalized Intelligent Outpatient Reception based on Large Language Model with Multi-Agents Medical Scenario Simulation
by: Bao, Zhijie, et al.
Published: (2024)
AI-Press: A Multi-Agent News Generating and Feedback Simulation System Powered by Large Language Models
by: Liu, Xiawei, et al.
Published: (2024)
Synergistic Multi-Agent Framework with Trajectory Learning for Knowledge-Intensive Tasks
by: Yue, Shengbin, et al.
Published: (2024)
Symbolic Working Memory Enhances Language Models for Complex Rule Application
by: Wang, Siyuan, et al.
Published: (2024)
ALaRM: Align Language Models via Hierarchical Rewards Modeling
by: Lai, Yuhang, et al.
Published: (2024)
Unveiling the Truth and Facilitating Change: Towards Agent-based Large-scale Social Movement Simulation
by: Mou, Xinyi, et al.
Published: (2024)
SpeechMedAssist: Efficiently and Effectively Adapting Speech Language Models for Medical Consultation
by: Chen, Sirry, et al.
Published: (2026)
EmbSpatial-Bench: Benchmarking Spatial Understanding for Embodied Tasks with Large Vision-Language Models
by: Du, Mengfei, et al.
Published: (2024)
EcoLANG: Efficient and Effective Agent Communication Language Induction for Social Simulation
by: Mou, Xinyi, et al.
Published: (2025)
LifeSim: Long-Horizon User Life Simulator for Personalized Assistant Evaluation
by: Duan, Feiyu, et al.
Published: (2026)
Predicting Rewards Alongside Tokens: Non-disruptive Parameter Insertion for Efficient Inference Intervention in Large Language Model
by: Yuan, Chenhan, et al.
Published: (2024)
Qwen-Scope: Turning Sparse Features into Development Tools for Large Language Models
by: Deng, Boyi, et al.
Published: (2026)
CommunityBench: Benchmarking Community-Level Alignment across Diverse Groups and Tasks
by: Lin, Jiayu, et al.
Published: (2026)
AutoJudger: An Agent-Driven Framework for Efficient Benchmarking of MLLMs
by: Ding, Xuanwen, et al.
Published: (2025)
HyLaT: Efficient Multi-Agent Communication via Hybrid Latent-Text Protocol
by: Mou, Xinyi, et al.
Published: (2026)
P-MMEval: A Parallel Multilingual Multitask Benchmark for Consistent Evaluation of LLMs
by: Zhang, Yidan, et al.
Published: (2024)
Towards Reliable Medical LLMs: Benchmarking and Enhancing Confidence Estimation of Large Language Models in Medical Consultation
by: Ren, Zhiyao, et al.
Published: (2026)
InsQABench: Benchmarking Chinese Insurance Domain Question Answering with Large Language Models
by: Ding, Jing, et al.
Published: (2025)
Hallucination Detection via Internal States and Structured Reasoning Consistency in Large Language Models
by: Song, Yusheng, et al.
Published: (2025)
BaZi-Based Character Simulation Benchmark: Evaluating AI on Temporal and Persona Reasoning
by: Zheng, Siyuan, et al.
Published: (2025)
Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs
by: Wang, Siyuan, et al.
Published: (2024)
Stepwise Informativeness Search for Efficient and Effective LLM Reasoning
by: Wang, Siyuan, et al.
Published: (2025)
Language Evolution for Evading Social Media Regulation via LLM-based Multi-agent Simulation
by: Cai, Jinyu, et al.
Published: (2024)
From Individual to Society: A Survey on Social Simulation Driven by Large Language Model-based Agents
by: Mou, Xinyi, et al.
Published: (2024)
Direct Simultaneous Translation Activation for Large Audio-Language Models
by: Zhang, Pei, et al.
Published: (2025)
Large Language Models are In-Context Molecule Learners
by: Li, Jiatong, et al.
Published: (2024)
Strong Reasoning Isn't Enough: Evaluating Evidence Elicitation in Interactive Diagnosis
by: Long, Zhuohan, et al.
Published: (2026)
Large Language Model Benchmarks in Medical Tasks
by: Yan, Lawrence K. Q., et al.
Published: (2024)
Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment
by: Lu, Keming, et al.
Published: (2024)
DELAN: Dual-Level Alignment for Vision-and-Language Navigation by Cross-Modal Contrastive Learning
by: Du, Mengfei, et al.
Published: (2024)
VoCoT: Unleashing Visually Grounded Multi-Step Reasoning in Large Multi-Modal Models
by: Li, Zejun, et al.
Published: (2024)
AgentSense: Benchmarking Social Intelligence of Language Agents through Interactive Scenarios
by: Mou, Xinyi, et al.
Published: (2024)
Unveiling Linguistic Regions in Large Language Models
by: Zhang, Zhihao, et al.
Published: (2024)
mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models
by: Ye, Jiabo, et al.
Published: (2024)
A Survey on Self-Evolution of Large Language Models
by: Tao, Zhengwei, et al.
Published: (2024)
ElectionSim: Massive Population Election Simulation Powered by Large Language Model Driven Agents
by: Zhang, Xinnong, et al.
Published: (2024)