:: Library Catalog

Bibliographic Details
Main Authors:	Yu, Haofei; Hong, Zhaochen; Cheng, Zirui; Zhu, Kunlun; Xuan, Keyang; Yao, Jinwei; Feng, Tao; You, Jiaxuan
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2412.17767

Similar Items

ResearchArcade: Graph Interface for Academic Tasks
by: Xu, Jingjun, et al.
Published: (2025)

ConsistencyChecker: Tree-based Evaluation of LLM Generalization Capabilities
by: Hong, Zhaochen, et al.
Published: (2025)

TinyScientist: An Interactive, Extensible, and Controllable Framework for Building Research Agents
by: Yu, Haofei, et al.
Published: (2025)

How Far Are We From AGI: Are LLMs All We Need?
by: Feng, Tao, et al.
Published: (2024)

SocialVeil: Probing Social Intelligence of Language Agents under Communication Barriers
by: Xuan, Keyang, et al.
Published: (2026)

Time-R1: Towards Comprehensive Temporal Reasoning in LLMs
by: Liu, Zijia, et al.
Published: (2025)

Debugging Tabular Log as Dynamic Graphs
by: Liang, Chumeng, et al.
Published: (2025)

DecompressionLM: Deterministic, Diagnostic, and Zero-Shot Concept Graph Extraction from Language Models
by: Hong, Zhaochen, et al.
Published: (2026)

Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning
by: Zhang, Haozhen, et al.
Published: (2025)

Graph of Records: Boosting Retrieval Augmented Generation for Long-context Summarization with Graphs
by: Zhang, Haozhen, et al.
Published: (2024)

Probing the Knowledge Boundary: An Interactive Agentic Framework for Deep Knowledge Extraction
by: Yang, Yuheng, et al.
Published: (2026)

GMTRouter: Personalized LLM Router over Multi-turn User Interactions
by: Xie, Encheng, et al.
Published: (2025)

AcademicEval: Live Long-Context LLM Benchmark
by: Zhang, Haozhen, et al.
Published: (2025)

Steer2Adapt: Dynamically Composing Steering Vectors Elicits Efficient Adaptation of LLMs
by: Han, Pengrui, et al.
Published: (2026)

Sotopia-RL: Reward Design for Social Intelligence
by: Yu, Haofei, et al.
Published: (2025)

PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models
by: Song, Mingyang, et al.
Published: (2025)

Beyond Facts: Evaluating Intent Hallucination in Large Language Models
by: Hao, Yijie, et al.
Published: (2025)

ArtifactLinker: Linking Scientific Artifacts for Automatic State-of-the-Art Discovery
by: Yu, Haofei, et al.
Published: (2026)

Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning
by: Liu, Yong, et al.
Published: (2024)

ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates
by: Yang, Ling, et al.
Published: (2025)

LangFlow: Continuous Diffusion Rivals Discrete in Language Modeling
by: Chen, Yuxin, et al.
Published: (2026)

LiveTradeBench: Seeking Real-World Alpha with Large Language Models
by: Yu, Haofei, et al.
Published: (2025)

GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation
by: Feng, Tao, et al.
Published: (2025)

PersonalizedRouter: Personalized LLM Routing via Graph-based User Preference Modeling
by: Dai, Zhongjie, et al.
Published: (2025)

MERIT: Maximum-normalized Element-wise Ratio for Language Model Large-batch Training
by: Luo, Yang, et al.
Published: (2025)

Table as Thought: Exploring Structured Thoughts in LLM Reasoning
by: Sun, Zhenjie, et al.
Published: (2025)

MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents
by: Zhu, Kunlun, et al.
Published: (2025)

Learning Query-Aware Budget-Tier Routing for Runtime Agent Memory
by: Zhang, Haozhen, et al.
Published: (2026)

In-Context Learning May Not Elicit Trustworthy Reasoning: A-Not-B Errors in Pretrained Language Models
by: Han, Pengrui, et al.
Published: (2024)

EnvScaler: Scaling Tool-Interactive Environments for LLM Agent via Programmatic Synthesis
by: Song, Xiaoshuai, et al.
Published: (2026)

Graph World Model
by: Feng, Tao, et al.
Published: (2025)

Synthetic Computers at Scale for Long-Horizon Productivity Simulation
by: Ge, Tao, et al.
Published: (2026)

Evolutionary Guided Decoding: Iterative Value Refinement for LLMs
by: Liu, Zhenhua, et al.
Published: (2025)

Learning to Reason as Action Abstractions with Scalable Mid-Training RL
by: Zhang, Shenao, et al.
Published: (2025)

Reinforcing Human Behavior Simulation via Verbal Feedback
by: Sun, Weiwei, et al.
Published: (2026)

NoveltyRank: A Retrieval-Augmented Framework for Conceptual Novelty Estimation in AI Research
by: Yan, Zhengxu, et al.
Published: (2025)

CreditAudit: 2$^\text{nd}$ Dimension for LLM Evaluation and Selection
by: Song, Yiliang, et al.
Published: (2026)

Multi-Scale Heterogeneous Text-Attributed Graph Datasets From Diverse Domains
by: Liu, Yunhui, et al.
Published: (2024)

TaeBench: Improving Quality of Toxic Adversarial Examples
by: Zhu, Xuan, et al.
Published: (2024)

Deep Learning Approaches for Improving Question Answering Systems in Hepatocellular Carcinoma Research
by: Huo, Shuning, et al.
Published: (2024)