:: Library Catalog

Bibliographic Details
Main Authors:	Xu, Ruihan; Gao, Yuting; Wang, Lan; Li, Jianing; Chen, Weihao; Guo, Qingpei; Yang, Ming; Zhang, Shiliang
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.09080

Similar Items

OrdMoE: Preference Alignment via Hierarchical Expert Group Ranking in Multimodal Mixture-of-Experts LLMs
by: Gao, Yuting, et al.
Published: (2025)

FlattenGPT: Depth Compression for Transformer with Layer Flattening
by: Xu, Ruihan, et al.
Published: (2026)

Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs
by: Xuan, Shiyu, et al.
Published: (2023)

AnyExperts: On-Demand Expert Allocation for Multimodal Language Models with Mixture of Expert
by: Gao, Yuting, et al.
Published: (2025)

LoopQ: Quantization for Recursive Transformers
by: Fang, Rui, et al.
Published: (2026)

SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment
by: Ma, Ziping, et al.
Published: (2024)

PixelGen: Improving Pixel Diffusion with Perceptual Supervision
by: Ma, Zehong, et al.
Published: (2026)

One Step Forward and K Steps Back: Better Reasoning with Denoising Recursion Models
by: Cameron, Chris, et al.
Published: (2026)

EvoMoE: Expert Evolution in Mixture of Experts for Multimodal Large Language Models
by: Jing, Linglin, et al.
Published: (2025)

M2-Encoder: Advancing Bilingual Image-Text Understanding by Large-scale Efficient Pretraining
by: Guo, Qingpei, et al.
Published: (2024)

Step Back to Leap Forward: Self-Backtracking for Boosting Reasoning of Language Models
by: Yang, Xiao-Wen, et al.
Published: (2025)

NN-Former: Rethinking Graph Structure in Neural Architecture Representation
by: Xu, Ruihan, et al.
Published: (2025)

TabKANet: Tabular Data Modeling with Kolmogorov-Arnold Network and Transformer
by: Gao, Weihao, et al.
Published: (2024)

VaccineRAG: Boosting Multimodal Large Language Models' Immunity to Harmful RAG Samples
by: Sun, Qixin, et al.
Published: (2025)

Moving Forward, Looking Back
by: Hagener, Malte
Published: (2010)

BridgeDrive: Diffusion Bridge Policy for Closed-Loop Trajectory Planning in Autonomous Driving
by: Liu, Shu, et al.
Published: (2025)

IG-MCTS: Human-in-the-Loop Cooperative Navigation under Incomplete Information
by: Chen, Shenghui, et al.
Published: (2025)

SpatialBench: Benchmarking Multimodal Large Language Models for Spatial Cognition
by: Xu, Peiran, et al.
Published: (2025)

Social Debiasing for Fair Multi-modal LLMs
by: Cheng, Harry, et al.
Published: (2024)

Dual Tuning for Reasoning Efficacy-Driven Data Curation in Multimodal LLM Training
by: Zheng, Ruobing, et al.
Published: (2026)

LAST: Leveraging Tools as Hints to Enhance Spatial Reasoning for Multimodal Large Language Models
by: Tian, Shi-Yu, et al.
Published: (2026)

Computing: Looking Back and Moving Forward
by: Golec, Muhammed, et al.
Published: (2024)

SyncSpeech: Efficient and Low-Latency Text-to-Speech based on Temporal Masked Transformer
by: Sheng, Zhengyan, et al.
Published: (2025)

What Makes Looped Transformers Perform Better Than Non-Recursive Ones
by: Gong, Zixuan, et al.
Published: (2025)

RLinf: Flexible and Efficient Large-scale Reinforcement Learning via Macro-to-Micro Flow Transformation
by: Yu, Chao, et al.
Published: (2025)

Benchmarking Open-Source Large Language Models on Healthcare Text Classification Tasks
by: Guo, Yuting, et al.
Published: (2025)

Efficient Prompt Tuning by Multi-Space Projection and Prompt Fusion
by: Lan, Pengxiang, et al.
Published: (2024)

Resource-Efficient Reinforcement for Reasoning Large Language Models via Dynamic One-Shot Policy Refinement
by: Zhang, Yunjian, et al.
Published: (2026)

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
by: Botev, Aleksandar, et al.
Published: (2024)

You Only Forward Once: An Efficient Compositional Judging Paradigm
by: Zhang, Tianlong, et al.
Published: (2025)

LOOPRAG: Enhancing Loop Transformation Optimization with Retrieval-Augmented Large Language Models
by: Zhi, Yijie, et al.
Published: (2025)

Automatic Instruction Evolving for Large Language Models
by: Zeng, Weihao, et al.
Published: (2024)

Enabling Real-Time Colonoscopic Polyp Segmentation on Commodity CPUs via Ultra-Lightweight Architecture
by: Gao, Weihao, et al.
Published: (2026)

Memory-Efficient Looped Transformer: Decoupling Compute from Memory in Looped Language Models
by: Vendrell, Victor Conchello, et al.
Published: (2026)

XGrammar: Flexible and Efficient Structured Generation Engine for Large Language Models
by: Dong, Yixin, et al.
Published: (2024)

ReCAP: Recursive Context-Aware Reasoning and Planning for Large Language Model Agents
by: Zhang, Zhenyu, et al.
Published: (2025)

Photon: Speedup Volume Understanding with Efficient Multimodal Large Language Models
by: Fang, Chengyu, et al.
Published: (2026)

VideoScaffold: Elastic-Scale Visual Hierarchies for Streaming Video Understanding in MLLMs
by: Zheng, Naishan, et al.
Published: (2025)

Agent Skills for Large Language Models: Architecture, Acquisition, Security, and the Path Forward
by: Xu, Renjun, et al.
Published: (2026)

MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities
by: Yu, Weihao, et al.
Published: (2023)