| Main Authors: | Ouyang, Kun, Liu, Yuanxin, Yao, Linli, Cai, Yishuo, Zhou, Hao, Zhou, Jie, Meng, Fandong, Sun, Xu |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.20470 |
Similar Items
SpaceR: Reinforcing MLLMs in Video Spatial Reasoning
by: Ouyang, Kun, et al.
Published: (2025)
PunchBench: Benchmarking MLLMs in Multimodal Punchline Comprehension
by: Ouyang, Kun, et al.
Published: (2024)
RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction
by: Wang, Yuchi, et al.
Published: (2025)
Figure It Out: Improve the Frontier of Reasoning with Executable Visual States
by: Chen, Meiqi, et al.
Published: (2025)
Video Understanding Reward Modeling: A Robust Benchmark and Performant Reward Models
by: Wei, Yuancheng, et al.
Published: (2026)
Investigating Cross-Modal Skill Injection: Scenarios, Methods, and Hyperparameters
by: Xu, Zhiyu, et al.
Published: (2026)
DeepTrans: Deep Reasoning Translation via Reinforcement Learning
by: Wang, Jiaan, et al.
Published: (2025)
ExTrans: Multilingual Deep Reasoning Translation via Exemplar-Enhanced Reinforcement Learning
by: Wang, Jiaan, et al.
Published: (2025)
Continuous Visual Autoregressive Generation via Score Maximization
by: Shao, Chenze, et al.
Published: (2025)
EAG: Extract and Generate Multi-way Aligned Corpus for Complete Multi-lingual Neural Machine Translation
by: Xu, Yulin, et al.
Published: (2022)
DRT: Deep Reasoning Translation via Long Chain-of-Thought
by: Wang, Jiaan, et al.
Published: (2024)
TimeChat-Captioner: Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions
by: Yao, Linli, et al.
Published: (2026)
Towards Codable Watermarking for Injecting Multi-bits Information to LLMs
by: Wang, Lean, et al.
Published: (2023)
TEAL: Tokenize and Embed ALL for Multi-modal Large Language Models
by: Yang, Zhen, et al.
Published: (2023)
Temporal Reasoning Transfer from Text to Video
by: Li, Lei, et al.
Published: (2024)
CSCD-NS: a Chinese Spelling Check Dataset for Native Speakers
by: Hu, Yong, et al.
Published: (2022)
Accelerating Inference in Large Language Models with a Unified Layer Skipping Strategy
by: Liu, Yijin, et al.
Published: (2024)
THOR-MoE: Hierarchical Task-Guided and Context-Responsive Routing for Neural Machine Translation
by: Liang, Yunlong, et al.
Published: (2025)
Beyond Next Token Prediction: Patch-Level Training for Large Language Models
by: Shao, Chenze, et al.
Published: (2024)
Efficient Covariance Estimation for Sparsified Functional Data
by: Zheng, Sijie, et al.
Published: (2025)
Readability-Robust Code Summarization via Meta Curriculum Learning
by: Zeng, Wenhao, et al.
Published: (2026)
Outdated Issue Aware Decoding for Reasoning Questions on Edited Knowledge
by: Sun, Zengkui, et al.
Published: (2024)
Large Language Models Are Not Robust Multiple Choice Selectors
by: Zheng, Chujie, et al.
Published: (2023)
MiniPLM: Knowledge Distillation for Pre-Training Language Models
by: Gu, Yuxian, et al.
Published: (2024)
UME-R1: Exploring Reasoning-Driven Generative Multimodal Embeddings
by: Lan, Zhibin, et al.
Published: (2025)
LaCo: Efficient Layer-wise Compression of Visual Tokens for Multimodal Large Language Models
by: Liu, Juntao, et al.
Published: (2025)
TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos
by: Yao, Linli, et al.
Published: (2025)
VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?
by: Liu, Yuanxin, et al.
Published: (2025)
A Law Reasoning Benchmark for LLM with Tree-Organized Structures including Factum Probandum, Evidence and Experiences
by: Shen, Jiaxin, et al.
Published: (2025)
Think Natively: Unlocking Multilingual Reasoning with Consistency-Enhanced Reinforcement Learning
by: Zhang, Xue, et al.
Published: (2025)
DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models
by: Yao, Linli, et al.
Published: (2024)
When Thinking Hurts: Mitigating Visual Forgetting in Video Reasoning via Frame Repetition
by: Sun, Xiaokun, et al.
Published: (2026)
CRAT: A Multi-Agent Framework for Causality-Enhanced Reflective and Retrieval-Augmented Translation with Large Language Models
by: Chen, Meiqi, et al.
Published: (2024)
TIM: Teaching Large Language Models to Translate with Comparison
by: Zeng, Jiali, et al.
Published: (2023)
Retrieval-Augmented Machine Translation with Unstructured Knowledge
by: Wang, Jiaan, et al.
Published: (2024)
Improving Machine Translation with Large Language Models: A Preliminary Study with Cooperative Decoding
by: Zeng, Jiali, et al.
Published: (2023)
Language Generation with Strictly Proper Scoring Rules
by: Shao, Chenze, et al.
Published: (2024)
Understanding and Addressing the Under-Translation Problem from the Perspective of Decoding Objective
by: Shao, Chenze, et al.
Published: (2024)
SlangDIT: Benchmarking LLMs in Interpretative Slang Translation
by: Liang, Yunlong, et al.
Published: (2025)
Continuous Autoregressive Language Models
by: Shao, Chenze, et al.
Published: (2025)