| Main Authors: | He, Xuming, Fan, Zehao, Li, Hengjia, Zhuo, Fan, Xu, Hankun, Cheng, Senlin, Weng, Di, Liu, Haifeng, Ye, Can, Wu, Boxi |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.02622 |
Similar Items
PhyRPR: Training-Free Physics-Constrained Video Generation
by: Zhao, Yibo, et al.
Published: (2026)
Searching Priors Makes Text-to-Video Synthesis Better
by: Cheng, Haoran, et al.
Published: (2024)
iRULER: Intelligible Rubric-Based User-Defined LLM Evaluation for Revision
by: Bai, Jingwen, et al.
Published: (2026)
RULER: Representation-Level Verification of Machine Unlearning
by: Cosma, Georgina, et al.
Published: (2026)
MagicID: Hybrid Preference Optimization for ID-Consistent and Dynamic-Preserved Video Customization
by: Li, Hengjia, et al.
Published: (2025)
AVA-Bench: Atomic Visual Ability Benchmark for Vision Foundation Models
by: Mai, Zheda, et al.
Published: (2025)
PersonalVideo: High ID-Fidelity Video Customization without Dynamic and Semantic Degradation
by: Li, Hengjia, et al.
Published: (2024)
RULER: What's the Real Context Size of Your Long-Context Language Models?
by: Hsieh, Cheng-Ping, et al.
Published: (2024)
RoadBench: A Vision-Language Foundation Model and Benchmark for Road Damage Understanding
by: Xiao, Xi, et al.
Published: (2025)
Which LLMs Get the Joke? Probing Non-STEM Reasoning Abilities with HumorBench
by: Narad, Reuben, et al.
Published: (2025)
ThinkRL-Edit: Thinking in Reinforcement Learning for Reasoning-Centric Image Editing
by: Li, Hengjia, et al.
Published: (2026)
GameBench: Evaluating Strategic Reasoning Abilities of LLM Agents
by: Costarelli, Anthony, et al.
Published: (2024)
HeartBench: Probing Core Dimensions of Anthropomorphic Intelligence in LLMs
by: Liu, Jiaxin, et al.
Published: (2025)
VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?
by: Liu, Yuanxin, et al.
Published: (2025)
Sharp Eyes and Memory for VideoLLMs: Information-Aware Visual Token Pruning for Efficient and Reliable VideoLLM Reasoning
by: Qin, Jialong, et al.
Published: (2025)
LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models
by: Gui, Jiayi, et al.
Published: (2024)
KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models
by: Wu, Yongliang, et al.
Published: (2025)
Unveiling the Compositional Ability Gap in Vision-Language Reasoning Model
by: Li, Tianle, et al.
Published: (2025)
MagicView: Multi-View Consistent Identity Customization via Priors-Guided In-Context Learning
by: Li, Hengjia, et al.
Published: (2025)
LocateBench: Evaluating the Locating Ability of Vision Language Models
by: Chiang, Ting-Rui, et al.
Published: (2024)
UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench
by: Yu, Boxi, et al.
Published: (2025)
STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence
by: Liu, Zihan, et al.
Published: (2025)
Reasoning via Video: The First Evaluation of Video Models' Reasoning Abilities through Maze-Solving Tasks
by: Yang, Cheng, et al.
Published: (2025)
Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning
by: Zhou, Yuhao, et al.
Published: (2025)
Photoelectron superlubricity
by: Chen, Cheng, et al.
Published: (2024)
SPT: Sequence Prompt Transformer for Interactive Image Segmentation
by: Cheng, Senlin, et al.
Published: (2024)
DeonticBench: A Benchmark for Reasoning over Rules
by: Dou, Guangyao, et al.
Published: (2026)
UniEditBench: A Unified and Cost-Effective Benchmark for Image and Video Editing via Distilled MLLMs
by: Jiang, Lifan, et al.
Published: (2026)
ViTime: Foundation Model for Time Series Forecasting Powered by Vision Intelligence
by: Yang, Luoxiao, et al.
Published: (2024)
GDI-Bench: A Benchmark for General Document Intelligence with Vision and Reasoning Decoupling
by: Li, Siqi, et al.
Published: (2025)
Learning by Correction: Efficient Tuning Task for Zero-Shot Generative Vision-Language Reasoning
by: Li, Rongjie, et al.
Published: (2024)
Fostering Video Reasoning via Next-Event Prediction
by: Wang, Haonan, et al.
Published: (2025)
AQA-Bench: An Interactive Benchmark for Evaluating LLMs' Sequential Reasoning Ability
by: Yang, Siwei, et al.
Published: (2024)
What's Missing in Vision-Language Models? Probing Their Struggles with Causal Order Reasoning
by: Weng, Zhaotian, et al.
Published: (2025)
Probing and Bridging Geometry-Interaction Cues for Affordance Reasoning in Vision Foundation Models
by: Zhang, Qing, et al.
Published: (2026)
MOSS-ChatV: Reinforcement Learning with Process Reasoning Reward for Video Temporal Reasoning
by: Tao, Sicheng, et al.
Published: (2025)
VLRS-Bench: A Vision-Language Reasoning Benchmark for Remote Sensing
by: Luo, Zhiming, et al.
Published: (2026)
Generating Graph-like Rules for Knowledge Graph Reasoning via Diffusion Models
by: Cheng, Haoxiang, et al.
Published: (2026)
Towards Generalist Intelligence in Dentistry: Vision Foundation Models for Oral and Maxillofacial Radiology
by: Huang, Xinrui, et al.
Published: (2025)
UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces
by: Zhao, Baining, et al.
Published: (2025)