| Main Authors: | Du, Bodong, Liu, Bowen, Yu, Yang, Ding, Xinpeng, Wu, Zhiheng, Wang, Shuning, Nie, Shuo, Liu, Naiming, Chen, Qifeng, Song, Yangqiu, Li, Xiaomeng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.06537 |
Similar Items
Divide-then-Diagnose: Weaving Clinician-Inspired Contexts for Ultra-Long Capsule Endoscopy Videos
by: Liu, Bowen, et al.
Published: (2026)
RadHiera: Semantic Hierarchical Reinforcement Learning for Medical Report Generation
by: Du, Bodong, et al.
Published: (2025)
See Further, Think Deeper: Advancing VLM's Reasoning Ability with Low-level Visual Cues and Reflection
by: Wu, Zhiheng, et al.
Published: (2026)
Multi-Modal Explainable Medical AI Assistant for Trustworthy Human-AI Collaboration
by: Yang, Honglong, et al.
Published: (2025)
Distribution-Aware Reward Estimation for Test-Time Reinforcement Learning
by: Du, Bodong, et al.
Published: (2026)
Subgraph Aggregation for Out-of-Distribution Generalization on Graphs
by: Liu, Bowen, et al.
Published: (2024)
WildLMa: Long Horizon Loco-Manipulation in the Wild
by: Qiu, Ri-Zhao, et al.
Published: (2024)
Tri-Plane Mamba: Efficiently Adapting Segment Anything Model for 3D Medical Images
by: Wang, Hualiang, et al.
Published: (2024)
MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding
by: Zuo, Yuxin, et al.
Published: (2025)
QuarkMedSearch: A Long-Horizon Deep Search Agent for Exploring Medical Intelligence
by: Lin, Zhichao, et al.
Published: (2026)
PaMi-VDPO: Mitigating Video Hallucinations by Prompt-Aware Multi-Instance Video Preference Learning
by: Ding, Xinpeng, et al.
Published: (2025)
Spatially Grounded Long-Horizon Task Planning in the Wild
by: Jung, Sehun, et al.
Published: (2026)
AMemGym: Interactive Memory Benchmarking for Assistants in Long-Horizon Conversations
by: Jiayang, Cheng, et al.
Published: (2026)
Holistic Autonomous Driving Understanding by Bird's-Eye-View Injected Multi-Modal Large Models
by: Ding, Xinpeng, et al.
Published: (2024)
WorldWeaver: Generating Long-Horizon Video Worlds via Rich Perception
by: Liu, Zhiheng, et al.
Published: (2025)
LongVideoAgent: Multi-Agent Reasoning with Long Videos
by: Liu, Runtao, et al.
Published: (2025)
Deep Bayesian Reinforcement Learning for Spacecraft Proximity Maneuvers and Docking
by: Du, Desong, et al.
Published: (2023)
Towards Subgraph Isomorphism Counting with Graph Kernels
by: Liu, Xin, et al.
Published: (2024)
Channel Modeling and Rate Analysis of Optical Inter-Satellite Link (OISL)
by: Shang, Bodong, et al.
Published: (2025)
MedMT-Bench: Can LLMs Memorize and Understand Long Multi-Turn Conversations in Medical Scenarios?
by: Yang, Lin, et al.
Published: (2026)
ST-SimDiff: Balancing Spatiotemporal Similarity and Difference for Efficient Video Understanding with MLLMs
by: Luo, Bingjun, et al.
Published: (2026)
WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation
by: Ding, Shuangrui, et al.
Published: (2026)
Beyond Tools: Generative AI as Epistemic Infrastructure in Education
by: Chen, Bodong
Published: (2025)
1st Place Winner of the 2024 Pixel-level Video Understanding in the Wild (CVPR'24 PVUW) Challenge in Video Panoptic Segmentation and Best Long Video Consistency of Video Semantic Segmentation
by: Liu, Qingfeng, et al.
Published: (2024)
The Cognitive Bandwidth Bottleneck: Shifting Long-Horizon Agent from Planning with Actions to Planning with Schemas
by: Xu, Baixuan, et al.
Published: (2025)
LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding
by: Wu, Haoning, et al.
Published: (2024)
MedGround-R1: Advancing Medical Image Grounding via Spatial-Semantic Rewarded Group Relative Policy Optimization
by: Xu, Huihui, et al.
Published: (2025)
MedGRPO: Multi-Task Reinforcement Learning for Heterogeneous Medical Video Understanding
by: Su, Yuhao, et al.
Published: (2025)
MedCoT: Medical Chain of Thought via Hierarchical Expert
by: Liu, Jiaxiang, et al.
Published: (2024)
Towards Event-oriented Long Video Understanding
by: Du, Yifan, et al.
Published: (2024)
PyraVid: Hierarchical Multimodal Memory for Long-Horizon Video Reasoning
by: Yan, Sikuan, et al.
Published: (2026)
OmniV-Med: Scaling Medical Vision-Language Model for Universal Visual Understanding
by: Jiang, Songtao, et al.
Published: (2025)
InfiMed: Low-Resource Medical MLLMs with Advancing Understanding and Reasoning
by: Liu, Zeyu, et al.
Published: (2025)
Token Activation Map to Visually Explain Multimodal LLMs
by: Li, Yi, et al.
Published: (2025)
HiLM-D: Enhancing MLLMs with Multi-Scale High-Resolution Details for Autonomous Driving
by: Ding, Xinpeng, et al.
Published: (2023)
MedSapiens: Taking a Pose to Rethink Medical Imaging Landmark Detection
by: Elbatel, Marawan, et al.
Published: (2025)
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling
by: Tian, Zeyue, et al.
Published: (2024)
Synthetic Context Generation for Question Generation
by: Liu, Naiming, et al.
Published: (2024)
Circuit Complexity of Hierarchical Knowledge Tracing and Implications for Log-Precision Transformers
by: Liu, Naiming, et al.
Published: (2026)
MetaCLASS: Metacognitive Coaching for Learning with Adaptive Self-regulation Support
by: Liu, Naiming, et al.
Published: (2026)