| Main Authors: | Gu, Ying, Leong, Mei Chee, Tan, Hui Li, Mao, Shangbo, Li, Liyuan, Chen, Nancy |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.06201 |
Similar Items
Explicit Logic Channel for Validation and Enhancement of MLLMs on Zero-Shot Tasks
by: Leong, Mei Chee, et al.
Published: (2026)
Towards Self-Refinement of Vision-Language Models with Triangular Consistency
by: Deng, Yunlong, et al.
Published: (2025)
GuardAD: Safeguarding Autonomous Driving MLLMs via Markovian Safety Logic
by: Zhang, Tianyuan, et al.
Published: (2026)
Self-Evolving Spatial Reasoning in Vision Language Models via Geometric Logic Consistency
by: Liu, Junming, et al.
Published: (2026)
FloCA: Towards Faithful and Logically Consistent Flowchart Reasoning
by: Zou, Jinzi, et al.
Published: (2026)
From Training-Free to Adaptive: Empirical Insights into MLLMs' Understanding of Detection Information
by: Jiao, Qirui, et al.
Published: (2024)
Aligning Vision to Language: Annotation-Free Multimodal Knowledge Graph Construction for Enhanced LLMs Reasoning
by: Liu, Junming, et al.
Published: (2025)
Toward Dependency Dynamics in Multi-Agent Reinforcement Learning for Traffic Signal Control
by: Zhang, Yuli, et al.
Published: (2025)
Aligning Crowd-sourced Human Feedback for Reinforcement Learning on Code Generation by Large Language Models
by: Wong, Man Fai, et al.
Published: (2025)
Towards Scalable Web Accessibility Audit with MLLMs as Copilots
by: Gu, Ming, et al.
Published: (2025)
Target-specific Adaptation and Consistent Degradation Alignment for Cross-Domain Remaining Useful Life Prediction
by: Hou, Yubo, et al.
Published: (2025)
Communication Strategy on Macro-and-Micro Traffic State in Cooperative Deep Reinforcement Learning for Regional Traffic Signal Control
by: Gu, Hankang, et al.
Published: (2025)
MME-Reasoning: A Comprehensive Benchmark for Logical Reasoning in MLLMs
by: Yuan, Jiakang, et al.
Published: (2025)
CrossCult-KIBench: A Benchmark for Cross-Cultural Knowledge Insertion in MLLMs
by: Zeng, Zhen, et al.
Published: (2026)
FITRep: Attention-Guided Item Representation via MLLMs
by: Zhang, Guoxiao, et al.
Published: (2025)
OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs
by: Li, Caorui, et al.
Published: (2025)
VeriCoT: Neuro-symbolic Chain-of-Thought Validation via Logical Consistency Checks
by: Feng, Yu, et al.
Published: (2025)
Quantifying Multimodal Capabilities: Formal Generalization Guarantees in Pairwise Metric Learning
by: Zhou, Richeng, et al.
Published: (2026)
Large Language Models Are Effective Human Annotation Assistants, But Not Good Independent Annotators
by: Gu, Feng, et al.
Published: (2025)
Banking Done Right: Redefining Retail Banking with Language-Centric AI
by: Chua, Xin Jie, et al.
Published: (2025)
Self-Consistency as a Free Lunch: Reducing Hallucinations in Vision-Language Models via Self-Reflection
by: Han, Mingfei, et al.
Published: (2025)
AceWGS: An LLM-Aided Framework to Accelerate Catalyst Design for Water-Gas Shift Reactions
by: Chattoraj, Joyjit, et al.
Published: (2025)
A Medical Multimodal Diagnostic Framework Integrating Vision-Language Models and Logic Tree Reasoning
by: Zang, Zelin, et al.
Published: (2025)
Coarse-to-Fine Personalized LLM Impressions for Streamlined Radiology Reports
by: Sun, Chengbo, et al.
Published: (2025)
Affordance Benchmark for MLLMs
by: Wang, Junying, et al.
Published: (2025)
Vocabulary-Free 3D Instance Segmentation with Vision and Language Assistant
by: Mei, Guofeng, et al.
Published: (2024)
RynnEC: Bringing MLLMs into Embodied World
by: Dang, Ronghao, et al.
Published: (2025)
Integrating Vehicle Acoustic Data for Enhanced Urban Traffic Management: A Study on Speed Classification in Suzhou
by: Fan, Pengfei, et al.
Published: (2025)
Not Search, But Scan: Benchmarking MLLMs on Scan-Oriented Academic Paper Reasoning
by: Li, Rongjin, et al.
Published: (2026)
StrLoRA: Towards Streaming Continual Visual Instruction Tuning for MLLMs
by: Che, Chang, et al.
Published: (2026)
V-Reflection: Transforming MLLMs from Passive Observers to Active Interrogators
by: Zhou, Jiazhou, et al.
Published: (2026)
Metric Dynamic Equilibrium Logic
by: Becker, Arvid, et al.
Published: (2024)
Multi-Dimensional Prompt Chaining to Improve Open-Domain Dialogue Generation
by: Teng, Livia Leong Hui
Published: (2026)
Fine-tuning Pre-trained Vision-Language Models in a Human-Annotation-Free Manner
by: Wang, Qian-Wei, et al.
Published: (2026)
Characterizing Vision-Language-Action Models across XPUs: Constraints and Acceleration for On-Robot Deployment
by: Zhou, Kaijun, et al.
Published: (2026)
Redundancy Principles for MLLMs Benchmarks
by: Zhang, Zicheng, et al.
Published: (2025)
Cross-Lingual Consistency: A Novel Inference Framework for Advancing Reasoning in Large Language Models
by: Yu, Zhiwei, et al.
Published: (2025)
TOFA: Training-Free One-Shot Federated Adaptation for Vision-Language Models
by: Zhang, Li, et al.
Published: (2025)
Logic Agent: Enhancing Validity with Logic Rule Invocation
by: Liu, Hanmeng, et al.
Published: (2024)
Revisiting Service Level Objectives and System Level Metrics in Large Language Model Serving
by: Wang, Zhibin, et al.
Published: (2024)