:: Library Catalog

Bibliographic Details
Main Authors:	Dong, Daize; Chen, Junlin; Jia, Haolong; Wu, Jiawei; Di, Huanwei; Liu, Jiang; Wu, Jialian; Liu, Zhengzhong; Liu, Zicheng; Barsoum, Emad; Metaxas, Dimitris N.; Wang, Hongyi
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2606.00395

Similar Items

TTT-Bench: A Benchmark for Evaluating Reasoning Ability with Simple and Novel Tic-Tac-Toe-style Games
by: Mishra, Prakamya, et al.
Published: (2025)

Agent Laboratory: Using LLM Agents as Research Assistants
by: Schmidgall, Samuel, et al.
Published: (2025)

ReLibra: Routing-Replay-Guided Load Balancing for MoE Training in Reinforcement Learning
by: Jin, Chao, et al.
Published: (2026)

VideoSeek: Long-Horizon Video Agent with Tool-Guided Seeking
by: Lin, Jingyang, et al.
Published: (2026)

Learning from Online Videos at Inference Time for Computer-Use Agents
by: Liu, Yujian, et al.
Published: (2025)

Latent Visual Reasoning
by: Li, Bangzheng, et al.
Published: (2025)

XModBench: Benchmarking Cross-Modal Capabilities and Consistency in Omni-Language Models
by: Wang, Xingrui, et al.
Published: (2025)

Instella-T2I: Pushing the Limits of 1D Discrete Latent Space Image Generation
by: Wang, Ze, et al.
Published: (2025)

ImageDoctor: Diagnosing Text-to-Image Generation via Grounded Image Reasoning
by: Guo, Yuxiang, et al.
Published: (2025)

Self-Taught Agentic Long Context Understanding
by: Zhuang, Yufan, et al.
Published: (2025)

DRIFT: Transferring Reasoning Priors for Efficient MLLM Fine-Tuning
by: Huang, Chao, et al.
Published: (2025)

KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation
by: Wang, Xingrui, et al.
Published: (2025)

CD4LM: Consistency Distillation and aDaptive Decoding for Diffusion Language Models
by: Liang, Yihao, et al.
Published: (2026)

MOVi: Training-free Text-conditioned Multi-Object Video Generation
by: Rahman, Aimon, et al.
Published: (2025)

AdaptEvolve: Improving Efficiency of Evolutionary AI Agents through Adaptive Model Selection
by: Ray, Pretam, et al.
Published: (2026)

TaDA: Training-free recipe for Decoding with Adaptive KV Cache Compression and Mean-centering
by: Joshi, Vinay, et al.
Published: (2025)

Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE
by: Huang, Haiduo, et al.
Published: (2025)

Unleashing Hour-Scale Video Training for Long Video-Language Understanding
by: Lin, Jingyang, et al.
Published: (2025)

Pause and Think: A Dataset and Benchmark for Video-Grounded Assistive Action Suggestion
by: Singh, Shivam, et al.
Published: (2026)

SAND-Math: Using LLMs to Generate Novel, Difficult and Useful Mathematics Questions and Answers
by: Manem, Chaitanya, et al.
Published: (2025)

DTop-p MoE: Sparsity-Controlled Dynamic Top-p MoE for Foundation Model Pre-training
by: Jin, Can, et al.
Published: (2025)

DUET-VLM: Dual stage Unified Efficient Token reduction for VLM Training and Inference
by: Singh, Aditya Kumar, et al.
Published: (2026)

TermiGen: High-Fidelity Environment and Robust Trajectory Synthesis for Terminal Agents
by: Zhu, Kaijie, et al.
Published: (2026)

Instella: Fully Open Language Models with Stellar Performance
by: Liu, Jiang, et al.
Published: (2025)

PARD: Accelerating LLM Inference with Low-Cost PARallel Draft Model Adaptation
by: An, Zihao, et al.
Published: (2025)

EMO: Frustratingly Easy Progressive Training of Extendable MoE
by: Jin, Linghao, et al.
Published: (2026)

Routing Matters in MoE: Scaling Diffusion Transformers with Explicit Routing Guidance
by: Wei, Yujie, et al.
Published: (2025)

APRIL: Active Partial Rollouts in Reinforcement Learning to Tame Long-tail Generation
by: Zhou, Yuzhen, et al.
Published: (2025)

LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training
by: Zhu, Tong, et al.
Published: (2024)

LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
by: Qu, Xiaoye, et al.
Published: (2024)

Instantaneous Perception of Moving Objects in 3D
by: Liu, Di, et al.
Published: (2024)

CaptionQA: Is Your Caption as Useful as the Image Itself?
by: Yang, Shijia, et al.
Published: (2025)

Reliable Use of Lemmas via Eligibility Reasoning and Section-Aware Reinforcement Learning
by: Xu, Zhikun, et al.
Published: (2026)

STAMImputer: Spatio-Temporal Attention MoE for Traffic Data Imputation
by: Wang, Yiming, et al.
Published: (2025)

D$^{2}$MoE: Dual Routing and Dynamic Scheduling for Efficient On-Device MoE-based LLM Serving
by: Wang, Haodong, et al.
Published: (2025)

Grouter: Decoupling Routing from Representation for Accelerated MoE Training
by: Xu, Yuqi, et al.
Published: (2026)

Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
by: Yue, Tongtian, et al.
Published: (2024)

Input Domain Aware MoE: Decoupling Routing Decisions from Task Optimization in Mixture of Experts
by: Hua, Yongxiang, et al.
Published: (2025)

Stabilizing Efficient Reasoning with Step-Level Advantage Selection
by: Wang, Han, et al.
Published: (2026)

Token Level Routing Inference System for Edge Devices
by: She, Jianshu, et al.
Published: (2025)