| Main Authors: | Liu, Zirui, Li, Jiatong, Zhuang, Yan, Liu, Qi, Shen, Shuanghong, Ouyang, Jie, Cheng, Mingyue, Wang, Shijin |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.03475 |
Similar Items
Towards Stable and Structured Time Series Generation with Perturbation-Aware Flow Matching
by: Zhang, Jintao, et al.
Published: (2025)
MemCast: Memory-Driven Time Series Forecasting with Experience-Conditioned Reasoning
by: Tao, Xiaoyu, et al.
Published: (2026)
StaTS: Spectral Trajectory Schedule Learning for Adaptive Time Series Forecasting with Frequency Guided Denoiser
by: Zhang, Jintao, et al.
Published: (2026)
From Values to Tokens: An LLM-Driven Framework for Context-aware Time Series Forecasting via Symbolic Discretization
by: Tao, Xiaoyu, et al.
Published: (2025)
Survey of Computerized Adaptive Testing: A Machine Learning Perspective
by: Zhuang, Yan, et al.
Published: (2024)
PaperArena: An Evaluation Benchmark for Tool-Augmented Agentic Reasoning on Scientific Literature
by: Wang, Daoyu, et al.
Published: (2025)
DASKT: A Dynamic Affect Simulation Method for Knowledge Tracing
by: Sun, Xinjie, et al.
Published: (2025)
Generative Cognitive Diagnosis
by: Li, Jiatong, et al.
Published: (2025)
InstructTime++: Time Series Classification with Multimodal Language Modeling via Implicit Feature Enhancement
by: Cheng, Mingyue, et al.
Published: (2026)
TestAgent: An Adaptive and Intelligent Expert for Human Assessment
by: Yu, Junhao, et al.
Published: (2025)
SHADE-Arena: Evaluating Sabotage and Monitoring in LLM Agents
by: Kutasov, Jonathan, et al.
Published: (2025)
Scheduled Knowledge Acquisition on Lightweight Vector Symbolic Architectures for Brain-Computer Interfaces
by: Liu, Yejia, et al.
Published: (2024)
BrowserArena: Evaluating LLM Agents on Real-World Web Navigation Tasks
by: Anupam, Sagnik, et al.
Published: (2025)
Can Slow-thinking LLMs Reason Over Time? Empirical Studies in Time Series Forecasting
by: Cheng, Mingyue, et al.
Published: (2025)
PROF: An LLM-based Reward Code Preference Optimization Framework for Offline Imitation Learning
by: Sun, Shengjie, et al.
Published: (2025)
TriP-LLM: A Tri-Branch Patch-wise Large Language Model Framework for Time-Series Anomaly Detection
by: Yu, Yuan-Cheng, et al.
Published: (2025)
Are Your Reasoning Models Reasoning or Guessing? A Mechanistic Analysis of Hierarchical Reasoning Models
by: Ren, Zirui, et al.
Published: (2026)
TextArena
by: Guertler, Leon, et al.
Published: (2025)
Time Series Forecasting as Reasoning: A Slow-Thinking Approach with Reinforced LLMs
by: Zhou, Yitong, et al.
Published: (2025)
GHPO: Adaptive Guidance for Stable and Efficient LLM Reinforcement Learning
by: Liu, Ziru, et al.
Published: (2025)
Unified Uncertainty Estimation for Cognitive Diagnosis Models
by: Wang, Fei, et al.
Published: (2024)
MiMu: Mitigating Multiple Shortcut Learning Behavior of Transformers
by: Zhao, Lili, et al.
Published: (2025)
CirrusBench: Evaluating LLM-based Agents Beyond Correctness in Real-World Cloud Service Environments
by: Yu, Yi, et al.
Published: (2026)
Optimizing Student Ability Assessment: A Hierarchy Constraint-Aware Cognitive Diagnosis Framework for Educational Contexts
by: Sun, Xinjie, et al.
Published: (2024)
Accelerating LLM Reasoning via Early Rejection with Partial Reward Modeling
by: Cheshmi, Seyyed Saeid, et al.
Published: (2025)
CastFlow: Learning Role-Specialized Agentic Workflows for Time Series Forecasting
by: Pan, Bokai, et al.
Published: (2026)
ChessArena: A Chess Testbed for Evaluating Strategic Reasoning Capabilities of Large Language Models
by: Liu, Jincheng, et al.
Published: (2025)
AgentKernelArena: Generalization-Aware Benchmarking of GPU Kernel Optimization Agents
by: Younesian, Sharareh, et al.
Published: (2026)
Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning
by: Liu, Yong, et al.
Published: (2024)
TreeCut: A Synthetic Unanswerable Math Word Problem Dataset for LLM Hallucination Evaluation
by: Ouyang, Jialin
Published: (2025)
VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training
by: Shen, Guobin, et al.
Published: (2026)
A Statistical Framework for Ranking LLM-Based Chatbots
by: Ameli, Siavash, et al.
Published: (2024)
Dependency-based Anomaly Detection: a General Framework and Comprehensive Evaluation
by: Lu, Sha, et al.
Published: (2020)
ArenaBencher: Automatic Benchmark Evolution via Multi-Model Competitive Evaluation
by: Liu, Qin, et al.
Published: (2025)
Refining Alignment Framework for Diffusion Models with Intermediate-Step Preference Ranking
by: Ren, Jie, et al.
Published: (2025)
Fast Graph Condensation with Structure-based Neural Tangent Kernel
by: Wang, Lin, et al.
Published: (2023)
Ladder: A Model-Agnostic Framework Boosting LLM-based Machine Translation to the Next Level
by: Feng, Zhaopeng, et al.
Published: (2024)
Instance-level Randomization: Toward More Stable LLM Evaluations
by: Li, Yiyang, et al.
Published: (2025)
ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking
by: Zhang, Qiang, et al.
Published: (2026)
JudgeBench: A Benchmark for Evaluating LLM-based Judges
by: Tan, Sijun, et al.
Published: (2024)