| Main Authors: | Yang, Qianyu, Liu, Yang, Li, Jiaqi, Bai, Jun, Chen, Hao, Chen, Kaiyuan, Duan, Tiliang, Dong, Jiayun, Hu, Xiaobo, Jia, Zixia, Peng, Tao, Ren, Yixin, Tian, Ran, Wang, Zaiyuan, Xiao, Yanglihong, Yao, Gang, Yin, Lingyue, Zhang, Ge, Zhang, Chun, Jiao, Jianpeng, Zheng, Zilong, Gong, Yuan |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.07980 |
Similar Items
Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs
by: Bai, Jun, et al.
Published: (2025)
Adaptive Preference Optimization with Uncertainty-aware Utility Anchor
by: Wang, Xiaobo, et al.
Published: (2025)
ReflectEvo: Improving Meta Introspection of Small LLMs by Learning Self-Reflection
by: Li, Jiaqi, et al.
Published: (2025)
FinSearchComp: Towards a Realistic, Expert-Level Evaluation of Financial Search and Reasoning
by: Hu, Liang, et al.
Published: (2025)
MARS-Bench: A Multi-turn Athletic Real-world Scenario Benchmark for Dialogue Evaluation
by: Yang, Chenghao, et al.
Published: (2025)
RAM: Towards an Ever-Improving Memory System by Learning from Communications
by: Li, Jiaqi, et al.
Published: (2024)
The AI Hippocampus: How Far are We From Human Memory?
by: Jia, Zixia, et al.
Published: (2026)
NarrativeLoom: Enhancing Creative Storytelling through Multi-Persona Collaborative Improvisation
by: Ma, Yuxi, et al.
Published: (2026)
Xetrieval: Mechanistically Explaining Dense Retrieval
by: Cai, Zhixin, et al.
Published: (2026)
RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling
by: Liu, Yang, et al.
Published: (2025)
TongSearch-QR: Reinforced Query Reasoning for Retrieval
by: Qin, Xubo, et al.
Published: (2025)
Combining Supervised Learning and Reinforcement Learning for Multi-Label Classification Tasks with Partial Labels
by: Jia, Zixia, et al.
Published: (2024)
Domain Adversarial Active Learning for Domain Generalization Classification
by: Chen, Jianting, et al.
Published: (2024)
Anomalous Chern-Simons orbital magnetoelectric coupling of three-dimensional Chern insulators: gauge-discontinuity formalism and adiabatic pumping
by: Xue, Yang, et al.
Published: (2025)
Make an Offer They Can't Refuse: Grounding Bayesian Persuasion in Real-World Dialogues without Pre-Commitment
by: He, Buwei, et al.
Published: (2025)
Filtrations on the derived category of twisted K3 surfaces
by: Chen, Zaiyuan, et al.
Published: (2024)
LLM Swiss Round: Aggregating Multi-Benchmark Performance via Competitive Swiss-System Dynamics
by: Liu, Jiashuo, et al.
Published: (2025)
DiscoX: Benchmarking Discourse-Level Translation task in Expert Domains
by: Zhao, Xiying, et al.
Published: (2025)
FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction
by: Zeng, Zhiyuan, et al.
Published: (2025)
C2PSA-Enhanced YOLOv11 Architecture: A Novel Approach for Small Target Detection in Cotton Disease Diagnosis
by: Wang, Kaiyuan, et al.
Published: (2025)
How Far Are We? Systematic Evaluation of LLMs vs. Human Experts in Mathematical Contest in Modeling
by: Liu, Yuhang, et al.
Published: (2026)
The Challenges of Textbook Access at Chinese Transnational Universities
by: Ran, Congjin, et al.
Published: (2020)
MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations
by: Ma, Yubo, et al.
Published: (2024)
Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers
by: Lou, Chao, et al.
Published: (2024)
In-Context Editing: Learning Knowledge from Self-Induced Distributions
by: Qi, Siyuan, et al.
Published: (2024)
Deep Pyoderma Caused by Serratia marcescens in a Border Collie in China
by: Ran Wang, et al.
Published: (2025)
LPFQA: A Long-Tail Professional Forum-based Benchmark for LLM Evaluation
by: Zhu, Liya, et al.
Published: (2025)
Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning
by: Wu, Tong, et al.
Published: (2025)
ProBench: Judging Multimodal Foundation Models on Open-ended Multi-domain Expert Tasks
by: Yang, Yan, et al.
Published: (2025)
How Far are Modern Trackers from UAV-Anti-UAV? A Million-Scale Benchmark and New Baseline
by: Zhang, Chunhui, et al.
Published: (2025)
The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement
by: Wang, Xiaobo, et al.
Published: (2026)
OmniGenBench: A Benchmark for Omnipotent Multimodal Generation across 50+ Tasks
by: Wang, Jiayu, et al.
Published: (2025)
Xpertbench: Expert Level Tasks with Rubrics-Based Evaluation
by: Liu, Xue, et al.
Published: (2026)
IDEA-Bench: How Far are Generative Models from Professional Designing?
by: Liang, Chen, et al.
Published: (2024)
Low‐Hysteresis Self‐Powered Flexible Humidity Sensor Based on Sulfonated Graphene Oxide for Breath Monitoring
by: Zhuohuan Wu, et al.
Published: (2025)
Mixture of A Million Experts
by: He, Xu Owen
Published: (2024)
SkillGenBench: Benchmarking Skill Generation Pipelines for LLM Agents
by: Zhou, Yifan, et al.
Published: (2026)
Advancements in Single Atom Catalysts for Electrocatalytic Nitrate Reduction Reaction
by: Lingyue Liu, et al.
Published: (2024)
Mean Curvature Flow for Isoparametric Submanifolds in Hyperbolic Spaces
by: Liu, Xiaobo, et al.
Published: (2025)
Action of $W$-type operators on Schur functions and Schur Q-functions
by: Liu, Xiaobo, et al.
Published: (2022)