| Main Authors: | Tang, Yi, Wang, Kai-Ni, Chen, Yang, He, Xiaopu, Zhou, Guangquan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2508.07292 |
Similar Items
EndoCoT: Scaling Endogenous Chain-of-Thought Reasoning in Diffusion Models
by: Dai, Xuanlang, et al.
Published: (2026)
An Agentic System for Rare Disease Diagnosis with Traceable Reasoning
by: Zhao, Weike, et al.
Published: (2025)
InternAgent: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification
by: InternAgent Team, et al.
Published: (2025)
RadFabric: Agentic AI System with Reasoning Capability for Radiology
by: Chen, Wenting, et al.
Published: (2025)
CogniAlign: Survivability-Grounded Multi-Agent Moral Reasoning for Safe and Transparent AI
by: Ali, Hasin Jawad, et al.
Published: (2025)
RadAgents: Multimodal Agentic Reasoning for Chest X-ray Interpretation with Radiologist-like Workflows
by: Zhang, Kai, et al.
Published: (2025)
Agent-X: Evaluating Deep Multimodal Reasoning in Vision-Centric Agentic Tasks
by: Ashraf, Tajamul, et al.
Published: (2025)
EndoStreamDepth: Temporally Consistent Monocular Depth Estimation for Endoscopic Video Streams
by: Li, Hao, et al.
Published: (2025)
History-Aware Reasoning for GUI Agents
by: Wang, Ziwei, et al.
Published: (2025)
Visual Merit or Linguistic Crutch? A Close Look at DeepSeek-OCR
by: Liang, Yunhao, et al.
Published: (2026)
ReLoop: "Seeing Twice and Thinking Backwards" via Closed-loop Training to Mitigate Hallucinations in Multimodal understanding
by: Yang, Jianjiang, et al.
Published: (2025)
Scaling Agentic Reinforcement Learning for Tool-Integrated Reasoning in VLMs
by: Lu, Meng, et al.
Published: (2025)
VideoARM: Agentic Reasoning over Hierarchical Memory for Long-Form Video Understanding
by: Yin, Yufei, et al.
Published: (2025)
From Consistency to Complementarity: Aligned and Disentangled Multi-modal Learning for Time Series Understanding and Reasoning
by: Ni, Hang, et al.
Published: (2026)
Sherlock: Self-Correcting Reasoning in Vision-Language Models
by: Ding, Yi, et al.
Published: (2025)
EndoChat: Grounded Multimodal Large Language Model for Endoscopic Surgery
by: Wang, Guankun, et al.
Published: (2025)
Diving into Self-Evolving Training for Multimodal Reasoning
by: Liu, Wei, et al.
Published: (2024)
Multimodal Human-AI Synergy for Medical Imaging Quality Control: A Hybrid Intelligence Framework with Adaptive Dataset Curation and Closed-Loop Evaluation
by: Qin, Zhi, et al.
Published: (2025)
EndoGaussian: Real-time Gaussian Splatting for Dynamic Endoscopic Scene Reconstruction
by: Liu, Yifan, et al.
Published: (2024)
Agent S: An Open Agentic Framework that Uses Computers Like a Human
by: Agashe, Saaket, et al.
Published: (2024)
Agri-R1: Agricultural Reasoning for Disease Diagnosis via Automated-Synthesis and Reinforcement Learning
by: Zhang, Wentao, et al.
Published: (2026)
EndoWave: Rational-Wavelet 4D Gaussian Splatting for Endoscopic Reconstruction
by: Wu, Taoyu, et al.
Published: (2025)
GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning
by: Chen, Yi, et al.
Published: (2025)
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models
by: Zhou, Andy, et al.
Published: (2023)
Clinical Cognition Alignment for Gastrointestinal Diagnosis with Multimodal LLMs
by: Zheng, Huan, et al.
Published: (2026)
rStar2-Agent: Agentic Reasoning Technical Report
by: Shang, Ning, et al.
Published: (2025)
PresentAgent-2: Towards Generalist Multimodal Presentation Agents
by: Wu, Wei, et al.
Published: (2026)
EndoGS: Deformable Endoscopic Tissues Reconstruction with Gaussian Splatting
by: Zhu, Lingting, et al.
Published: (2024)
CycleChart: A Unified Consistency-Based Learning Framework for Bidirectional Chart Understanding and Generation
by: Deng, Dazhen, et al.
Published: (2025)
EndoGen: Conditional Autoregressive Endoscopic Video Generation
by: Liu, Xinyu, et al.
Published: (2025)
Hierarchical Visual Agent: Managing Contexts in Joint Image-Text Space for Advanced Chart Reasoning
by: Dong, Qihua, et al.
Published: (2026)
EndoDepth: A Benchmark for Assessing Robustness in Endoscopic Depth Prediction
by: Reyes-Amezcua, Ivan, et al.
Published: (2024)
Decompose and Compare Consistency: Measuring VLMs' Answer Reliability via Task-Decomposition Consistency Comparison
by: Yang, Qian, et al.
Published: (2024)
MARS: Memory-Enhanced Agents with Reflective Self-improvement
by: Liang, Xuechen, et al.
Published: (2025)
Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models
by: Wu, Junfei, et al.
Published: (2024)
End-to-End Agentic RAG System Training for Traceable Diagnostic Reasoning
by: Zheng, Qiaoyu, et al.
Published: (2025)
ATLAS: Agentic or Latent Visual Reasoning? One Word is Enough for Both
by: Guo, Ziyu, et al.
Published: (2026)
CogniFold: Always-On Proactive Memory via Cognitive Folding
by: Wang, Suli, et al.
Published: (2026)
JARVIS: A Neuro-Symbolic Commonsense Reasoning Framework for Conversational Embodied Agents
by: Zheng, Kaizhi, et al.
Published: (2022)
EndoMamba: An Efficient Foundation Model for Endoscopic Videos via Hierarchical Pre-training
by: Tian, Qingyao, et al.
Published: (2025)