| Main Authors: | Yu, Chung-En Johnny, Jalaian, Brian, Bastian, Nathaniel D. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.16343 |
Similar Items
ORCA: An Agentic Reasoning Framework for Hallucination and Adversarial Robustness in Vision-Language Models
by: Yu, Chung-En Johnny, et al.
Published: (2025)
Hydra: An Agentic Reasoning Approach for Enhancing Adversarial Robustness and Mitigating Hallucinations in Vision-Language Models
by: Chung-En, et al.
Published: (2025)
SCoOP: Semantic Consistent Opinion Pooling for Uncertainty Quantification in Multiple Vision-Language Model Systems
by: Yu, Chung-En Johnny, et al.
Published: (2026)
Visual Multi-Agent System: Mitigating Hallucination Snowballing via Visual Flow
by: Yu, Xinlei, et al.
Published: (2025)
Learning Collective Dynamics of Multi-Agent Systems using Event-based Vision
by: Lee, Minah, et al.
Published: (2024)
AgentCVR: Active Multi-Agent Cross-Video Reasoning via Script-Simulated Reinforcement Learning
by: Qiu, Yilun, et al.
Published: (2026)
Concept-RuleNet: Grounded Multi-Agent Neurosymbolic Reasoning in Vision Language Models
by: Sinha, Sanchit, et al.
Published: (2025)
DVM-SLAM: Decentralized Visual Monocular Simultaneous Localization and Mapping for Multi-Agent Systems
by: Bird, Joshua, et al.
Published: (2025)
MAG-3D: Multi-Agent Grounded Reasoning for 3D Understanding
by: Zheng, Henry, et al.
Published: (2026)
A Multi-Agent Perception-Action Alliance for Efficient Long Video Reasoning
by: Xu, Yichang, et al.
Published: (2026)
FetalAgents: A Multi-Agent System for Fetal Ultrasound Image and Video Analysis
by: Hu, Xiaotian, et al.
Published: (2026)
Enhancing CLIP Robustness via Cross-Modality Alignment
by: Zhu, Xingyu, et al.
Published: (2025)
ReCCur: A Recursive Corner-Case Curation Framework for Robust Vision-Language Understanding in Open and Edge Scenarios
by: Wei, Yihan, et al.
Published: (2026)
VideoChat-M1: Collaborative Policy Planning for Video Understanding via Multi-Agent Reinforcement Learning
by: Chen, Boyu, et al.
Published: (2025)
V-Agent: An Interactive Video Search System Using Vision-Language Models
by: Park, SunYoung, et al.
Published: (2025)
AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning
by: Song, Mingyang, et al.
Published: (2026)
MARIC: Multi-Agent Reasoning for Image Classification
by: Seo, Wonduk, et al.
Published: (2025)
Hollywood Town: Long-Video Generation via Cross-Modal Multi-Agent Orchestration
by: Wei, Zheng, et al.
Published: (2025)
RadAgents: Multimodal Agentic Reasoning for Chest X-ray Interpretation with Radiologist-like Workflows
by: Zhang, Kai, et al.
Published: (2025)
Analyze-Prompt-Reason: A Collaborative Agent-Based Framework for Multi-Image Vision-Language Reasoning
by: Vlachos, Angelos, et al.
Published: (2025)
EH-Benchmark Ophthalmic Hallucination Benchmark and Agent-Driven Top-Down Traceable Reasoning Workflow
by: Pan, Xiaoyu, et al.
Published: (2025)
AURA: A Multi-Modal Medical Agent for Understanding, Reasoning & Annotation
by: Fathi, Nima, et al.
Published: (2025)
A Case Study of Counting the Number of Unique Users in Linear and Non-Linear Trails -- A Multi-Agent System Approach
by: Rahman, Tanvir
Published: (2025)
VideoMultiAgents: A Multi-Agent Framework for Video Question Answering
by: Kugo, Noriyuki, et al.
Published: (2025)
AQuaUI: Visual Token Reduction for GUI Agents with Adaptive Quadtrees
by: Li, Yuankai, et al.
Published: (2026)
Scene-Aware Vectorized Memory Multi-Agent Framework with Cross-Modal Differentiated Quantization VLMs for Visually Impaired Assistance
by: Wang, Xiangxiang, et al.
Published: (2025)
Towards Reliable Fetal Ultrasound Interpretation with Multi-Agent Collaboration
by: Hu, Xiaotian, et al.
Published: (2026)
OralAgent: Integrating Reasoning, Tools, and Knowledge for Interactive Dental Image Analysis
by: Hao, Jing, et al.
Published: (2026)
Sentinel: Embodied Cooperative Spatial Reasoning and Planning
by: Lin, Xiangye, et al.
Published: (2026)
Chain-of-Anomaly Thoughts with Large Vision-Language Models
by: Domingos, Pedro, et al.
Published: (2025)
A Multi-Agent System Enables Versatile Information Extraction from the Chemical Literature
by: Chen, Yufan, et al.
Published: (2025)
Metropolis-Hastings Captioning Game: Knowledge Fusion of Vision Language Models via Decentralized Bayesian Inference
by: Matsui, Yuta, et al.
Published: (2025)
StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration
by: Hu, Panwen, et al.
Published: (2024)
StarCraftImage: A Dataset For Prototyping Spatial Reasoning Methods For Multi-Agent Environments
by: Kulinski, Sean, et al.
Published: (2024)
Curriculum Guided Massive Multi Agent System Solving For Robust Long Horizon Tasks
by: Kar, Indrajit, et al.
Published: (2025)
LongVideoAgent: Multi-Agent Reasoning with Long Videos
by: Liu, Runtao, et al.
Published: (2025)
Think Before You Segment: An Object-aware Reasoning Agent for Referring Audio-Visual Segmentation
by: Zhou, Jinxing, et al.
Published: (2025)
SPACE: 3D Spatial Co-operation and Exploration Framework for Robust Mapping and Coverage with Multi-Robot Systems
by: Ghanta, Sai Krishna, et al.
Published: (2024)
Visual Sensor Pose Optimisation Using Visibility Models for Smart Cities
by: Arnold, Eduardo, et al.
Published: (2021)
AstroVLM: Expert Multi-agent Collaborative Reasoning for Astronomical Imaging Quality Diagnosis
by: Han, Yaohui, et al.
Published: (2026)