Library Catalog

Bibliographic Details
Main Authors:	Lei, Jingdi, Zhang, Di, Li, Junxian, Wang, Weida, Fan, Kaixuan, Liu, Xiang, Liu, Qihan, Ma, Xiaoteng, Chen, Baian, Poria, Soujanya
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.12357

Similar Items

Exact Flow Linear Attention: Exact Solution from Continuous-Time Dynamics
by: Lei, Jingdi, et al.
Published: (2025)

OffTopicEval: When Large Language Models Enter the Wrong Chat, Almost Always!
by: Lei, Jingdi, et al.
Published: (2025)

Towards Robust Instruction Tuning on Multimodal Large Language Models
by: Han, Wei, et al.
Published: (2024)

Large Language Models for Automated Open-domain Scientific Hypotheses Discovery
by: Yang, Zonglin, et al.
Published: (2023)

Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic
by: Bhardwaj, Rishabh, et al.
Published: (2024)

Self-Adaptive Sampling for Efficient Video Question-Answering on Image--Text Models
by: Han, Wei, et al.
Published: (2023)

Efficient Multi-agent Reinforcement Learning by Planning
by: Liu, Qihan, et al.
Published: (2024)

Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost
by: Xuan, Richmond Sin Jing, et al.
Published: (2026)

Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model
by: Kang, Jaeyong, et al.
Published: (2023)

Can-Do! A Dataset and Neuro-Symbolic Grounded Framework for Embodied Planning with Large Multimodal Models
by: Chia, Yew Ken, et al.
Published: (2024)

MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses
by: Yang, Zonglin, et al.
Published: (2024)

Are Language Models Puzzle Prodigies? Algorithmic Puzzles Unveil Serious Challenges in Multimodal Reasoning
by: Ghosal, Deepanway, et al.
Published: (2024)

Toward Robust Multimodal Learning using Multimodal Foundational Models
by: Zhao, Xianbing, et al.
Published: (2024)

Stacked from One: Multi-Scale Self-Injection for Context Window Extension
by: Han, Wei, et al.
Published: (2026)

Two are better than one: Context window extension with multi-grained self-injection
by: Han, Wei, et al.
Published: (2024)

Training Vision-Language Process Reward Models for Test-Time Scaling in Multimodal Reasoning: Key Insights and Lessons Learned
by: Ong, Brandon, et al.
Published: (2025)

Harnessing Large Language Models for Scientific Novelty Detection
by: Liu, Yan, et al.
Published: (2025)

Beyond What to Select: A Plug-and-play Oscillatory Data-Volume Scheduling for Efficient Model Training
by: Yang, Suorong, et al.
Published: (2026)

E-mem: Multi-agent based Episodic Context Reconstruction for LLM Agent Memory
by: Wang, Kaixiang, et al.
Published: (2026)

Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical Supervision
by: Pala, Tej Deep, et al.
Published: (2025)

DialogXpert: Driving Intelligent and Emotion-Aware Conversations through Online Value-Based Reinforcement Learning with LLM Priors
by: Rakib, Tazeek Bin Abdur, et al.
Published: (2025)

The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles
by: Toh, Vernon Y. H., et al.
Published: (2025)

Control-R: Towards controllable test-time scaling
by: Zhang, Di, et al.
Published: (2025)

VRoPE: Rotary Position Embedding for Video Large Language Models
by: Liu, Zikang, et al.
Published: (2025)

NORA-1.5: A Vision-Language-Action Model Trained using World Model- and Action-based Preference Rewards
by: Hung, Chia-Yu, et al.
Published: (2025)

Ruby Teaming: Improving Quality Diversity Search with Memory for Automated Red Teaming
by: Han, Vernon Toh Yan, et al.
Published: (2024)

Sowing the Wind, Reaping the Whirlwind: The Impact of Editing Language Models
by: Hazra, Rima, et al.
Published: (2024)

DiffPO: Diffusion-styled Preference Optimization for Efficient Inference-Time Alignment of Large Language Models
by: Chen, Ruizhe, et al.
Published: (2025)

PREMISE: Matching-based Prediction for Accurate Review Recommendation
by: Han, Wei, et al.
Published: (2025)

Not All Votes Count! Programs as Verifiers Improve Self-Consistency of Language Models for Math Reasoning
by: Toh, Vernon Y. H., et al.
Published: (2024)

Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations
by: Hazra, Rima, et al.
Published: (2024)

Understanding the Capabilities and Limitations of Large Language Models for Cultural Commonsense
by: Shen, Siqi, et al.
Published: (2024)

NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks
by: Hung, Chia-Yu, et al.
Published: (2025)

Reasoning Paths Optimization: Learning to Reason and Explore From Diverse Paths
by: Chia, Yew Ken, et al.
Published: (2024)

The Five Ws of Multi-Agent Communication: Who Talks to Whom, When, What, and Why -- A Survey from MARL to Emergent Language and LLMs
by: Chen, Jingdi, et al.
Published: (2026)

10 Open Challenges Steering the Future of Vision-Language-Action Models
by: Poria, Soujanya, et al.
Published: (2025)

DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling
by: Deep, Pala Tej, et al.
Published: (2024)

WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models
by: Gupta, Prannaya, et al.
Published: (2024)

PMG : Personalized Multimodal Generation with Large Language Models
by: Shen, Xiaoteng, et al.
Published: (2024)

Memory-Efficient LLM Training with Online Subspace Descent
by: Liang, Kaizhao, et al.
Published: (2024)