:: Library Catalog

Bibliographic Details
Main Authors:	Yu, Chung-En Johnny; Jalaian, Brian; Bastian, Nathaniel D.
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Multiagent Systems
Online Access:	https://arxiv.org/abs/2509.15435

Similar Items

Hydra: An Agentic Reasoning Approach for Enhancing Adversarial Robustness and Mitigating Hallucinations in Vision-Language Models
by: Chung-En, et al.
Published: (2025)

Visual Reasoning Agent: Robust Vision Systems in Remote Sensing via Inference-Time Scaling
by: Yu, Chung-En Johnny, et al.
Published: (2025)

SCoOP: Semantic Consistent Opinion Pooling for Uncertainty Quantification in Multiple Vision-Language Model Systems
by: Yu, Chung-En Johnny, et al.
Published: (2026)

AIDE: Agentically Improve Visual Language Model with Domain Experts
by: Chiu, Ming-Chang, et al.
Published: (2025)

Autonomous Computer Vision Development with Agentic AI
by: Kim, Jin, et al.
Published: (2025)

Concept-RuleNet: Grounded Multi-Agent Neurosymbolic Reasoning in Vision Language Models
by: Sinha, Sanchit, et al.
Published: (2025)

Enhancing Agentic Autonomous Scientific Discovery with Vision-Language Model Capabilities
by: Gandhi, Kahaan, et al.
Published: (2025)

EH-Benchmark: Ophthalmic Hallucination Benchmark and Agent-Driven Top-Down Traceable Reasoning Workflow
by: Pan, Xiaoyu, et al.
Published: (2025)

Gen-n-Val: Agentic Image Data Generation and Validation
by: Huang, Jing-En, et al.
Published: (2025)

An Agentic System for Rare Disease Diagnosis with Traceable Reasoning
by: Zhao, Weike, et al.
Published: (2025)

Agentic-J: An AI Agent for Biological Microscopy Image Analysis
by: Johanns, Lukas, et al.
Published: (2026)

ReCCur: A Recursive Corner-Case Curation Framework for Robust Vision-Language Understanding in Open and Edge Scenarios
by: Wei, Yihan, et al.
Published: (2026)

See it. Say it. Sorted: Agentic System for Compositional Diagram Generation
by: Zhang, Hantao, et al.
Published: (2025)

PhotoFlow: Agentic 3D Virtual Photography Missions
by: Guo, Jiarui, et al.
Published: (2026)

V-Agent: An Interactive Video Search System Using Vision-Language Models
by: Park, SunYoung, et al.
Published: (2025)

ARGOS: Who, Where, and When in Agentic Multi-Camera Person Search
by: Kim, Myungchul, et al.
Published: (2026)

Chain-of-Anomaly Thoughts with Large Vision-Language Models
by: Domingos, Pedro, et al.
Published: (2025)

MAViS: A Multi-Agent Framework for Long-Sequence Video Storytelling
by: Wang, Qian, et al.
Published: (2025)

Visual Multi-Agent System: Mitigating Hallucination Snowballing via Visual Flow
by: Yu, Xinlei, et al.
Published: (2025)

TRACE: A Self-Improving Framework for Robot Behavior Forecasting with Vision-Language Models
by: Puthumanaillam, Gokul, et al.
Published: (2025)

Metropolis-Hastings Captioning Game: Knowledge Fusion of Vision Language Models via Decentralized Bayesian Inference
by: Matsui, Yuta, et al.
Published: (2025)

AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning
by: Song, Mingyang, et al.
Published: (2026)

How Modality Shapes Perception and Reasoning: A Study of Error Propagation in ARC-AGI
by: Wen, Bo, et al.
Published: (2025)

An Artifact-based Agent Framework for Adaptive and Reproducible Medical Image Processing
by: Zuo, Lianrui, et al.
Published: (2026)

StarCraftImage: A Dataset For Prototyping Spatial Reasoning Methods For Multi-Agent Environments
by: Kulinski, Sean, et al.
Published: (2024)

MedChat: A Multi-Agent Framework for Multimodal Diagnosis with Large Language Models
by: Liu, Philip R., et al.
Published: (2025)

VQQA: An Agentic Approach for Video Evaluation and Quality Improvement
by: Song, Yiwen, et al.
Published: (2026)

OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning
by: Lu, Pan, et al.
Published: (2025)

COMIC: Agentic Sketch Comedy Generation
by: Hong, Susung, et al.
Published: (2026)

MARIC: Multi-Agent Reasoning for Image Classification
by: Seo, Wonduk, et al.
Published: (2025)

Picking the Right Specialist: Attentive Neural Process-based Selection of Task-Specialized Models as Tools for Agentic Healthcare Systems
by: Saha, Pramit, et al.
Published: (2026)

Drive My Way: Preference Alignment of Vision-Language-Action Model for Personalized Driving
by: Wang, Zehao, et al.
Published: (2026)

Agentic Design Review System
by: Nag, Sayan, et al.
Published: (2025)

RadAgents: Multimodal Agentic Reasoning for Chest X-ray Interpretation with Radiologist-like Workflows
by: Zhang, Kai, et al.
Published: (2025)

ProtoMedAgent: Multimodal Clinical Interpretability via Privacy-Aware Agentic Workflows
by: Pellicer, Alvaro Lopez, et al.
Published: (2026)

GenCellAgent: Generalizable, Training-Free Cellular Image Segmentation via Large Language Model Agents
by: Yu, Xi, et al.
Published: (2025)

COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
by: Zhang, Hongxin, et al.
Published: (2024)

VLM-Guided Iterative Refinement for Surgical Image Segmentation with Foundation Models
by: Lou, Ange, et al.
Published: (2026)

Local Prompt Adaptation for Style-Consistent Multi-Object Generation in Diffusion Models
by: Sanjyal, Ankit
Published: (2025)

Analyze-Prompt-Reason: A Collaborative Agent-Based Framework for Multi-Image Vision-Language Reasoning
by: Vlachos, Angelos, et al.
Published: (2025)