| Main Authors: | Dong, Daize, Chen, Junlin, Jia, Haolong, Wu, Jiawei, Di, Huanwei, Liu, Jiang, Wu, Jialian, Liu, Zhengzhong, Liu, Zicheng, Barsoum, Emad, Metaxas, Dimitris N., Wang, Hongyi |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2606.00395 |
Similar Items
TTT-Bench: A Benchmark for Evaluating Reasoning Ability with Simple and Novel Tic-Tac-Toe-style Games
by: Mishra, Prakamya, et al.
Published: (2025)
Agent Laboratory: Using LLM Agents as Research Assistants
by: Schmidgall, Samuel, et al.
Published: (2025)
ReLibra: Routing-Replay-Guided Load Balancing for MoE Training in Reinforcement Learning
by: Jin, Chao, et al.
Published: (2026)
VideoSeek: Long-Horizon Video Agent with Tool-Guided Seeking
by: Lin, Jingyang, et al.
Published: (2026)
Learning from Online Videos at Inference Time for Computer-Use Agents
by: Liu, Yujian, et al.
Published: (2025)
Latent Visual Reasoning
by: Li, Bangzheng, et al.
Published: (2025)
XModBench: Benchmarking Cross-Modal Capabilities and Consistency in Omni-Language Models
by: Wang, Xingrui, et al.
Published: (2025)
Instella-T2I: Pushing the Limits of 1D Discrete Latent Space Image Generation
by: Wang, Ze, et al.
Published: (2025)
ImageDoctor: Diagnosing Text-to-Image Generation via Grounded Image Reasoning
by: Guo, Yuxiang, et al.
Published: (2025)
Self-Taught Agentic Long Context Understanding
by: Zhuang, Yufan, et al.
Published: (2025)
DRIFT: Transferring Reasoning Priors for Efficient MLLM Fine-Tuning
by: Huang, Chao, et al.
Published: (2025)
KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation
by: Wang, Xingrui, et al.
Published: (2025)
CD4LM: Consistency Distillation and aDaptive Decoding for Diffusion Language Models
by: Liang, Yihao, et al.
Published: (2026)
MOVi: Training-free Text-conditioned Multi-Object Video Generation
by: Rahman, Aimon, et al.
Published: (2025)
AdaptEvolve: Improving Efficiency of Evolutionary AI Agents through Adaptive Model Selection
by: Ray, Pretam, et al.
Published: (2026)
TaDA: Training-free recipe for Decoding with Adaptive KV Cache Compression and Mean-centering
by: Joshi, Vinay, et al.
Published: (2025)
Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE
by: Huang, Haiduo, et al.
Published: (2025)
Unleashing Hour-Scale Video Training for Long Video-Language Understanding
by: Lin, Jingyang, et al.
Published: (2025)
Pause and Think: A Dataset and Benchmark for Video-Grounded Assistive Action Suggestion
by: Singh, Shivam, et al.
Published: (2026)
SAND-Math: Using LLMs to Generate Novel, Difficult and Useful Mathematics Questions and Answers
by: Manem, Chaitanya, et al.
Published: (2025)
DTop-p MoE: Sparsity-Controlled Dynamic Top-p MoE for Foundation Model Pre-training
by: Jin, Can, et al.
Published: (2025)
DUET-VLM: Dual stage Unified Efficient Token reduction for VLM Training and Inference
by: Singh, Aditya Kumar, et al.
Published: (2026)
TermiGen: High-Fidelity Environment and Robust Trajectory Synthesis for Terminal Agents
by: Zhu, Kaijie, et al.
Published: (2026)
Instella: Fully Open Language Models with Stellar Performance
by: Liu, Jiang, et al.
Published: (2025)
PARD: Accelerating LLM Inference with Low-Cost PARallel Draft Model Adaptation
by: An, Zihao, et al.
Published: (2025)
EMO: Frustratingly Easy Progressive Training of Extendable MoE
by: Jin, Linghao, et al.
Published: (2026)
Routing Matters in MoE: Scaling Diffusion Transformers with Explicit Routing Guidance
by: Wei, Yujie, et al.
Published: (2025)
APRIL: Active Partial Rollouts in Reinforcement Learning to Tame Long-tail Generation
by: Zhou, Yuzhen, et al.
Published: (2025)
LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training
by: Zhu, Tong, et al.
Published: (2024)
LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
by: Qu, Xiaoye, et al.
Published: (2024)
Instantaneous Perception of Moving Objects in 3D
by: Liu, Di, et al.
Published: (2024)
CaptionQA: Is Your Caption as Useful as the Image Itself?
by: Yang, Shijia, et al.
Published: (2025)
Reliable Use of Lemmas via Eligibility Reasoning and Section-Aware Reinforcement Learning
by: Xu, Zhikun, et al.
Published: (2026)
STAMImputer: Spatio-Temporal Attention MoE for Traffic Data Imputation
by: Wang, Yiming, et al.
Published: (2025)
D$^{2}$MoE: Dual Routing and Dynamic Scheduling for Efficient On-Device MoE-based LLM Serving
by: Wang, Haodong, et al.
Published: (2025)
Grouter: Decoupling Routing from Representation for Accelerated MoE Training
by: Xu, Yuqi, et al.
Published: (2026)
Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
by: Yue, Tongtian, et al.
Published: (2024)
Input Domain Aware MoE: Decoupling Routing Decisions from Task Optimization in Mixture of Experts
by: Hua, Yongxiang, et al.
Published: (2025)
Stabilizing Efficient Reasoning with Step-Level Advantage Selection
by: Wang, Han, et al.
Published: (2026)
Token Level Routing Inference System for Edge Devices
by: She, Jianshu, et al.
Published: (2025)