| Main Authors: | Yu, Chung-En Johnny, Jalaian, Brian, Bastian, Nathaniel D. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2509.15435 |
Similar Items
Hydra: An Agentic Reasoning Approach for Enhancing Adversarial Robustness and Mitigating Hallucinations in Vision-Language Models
by: Chung-En, et al.
Published: (2025)
Visual Reasoning Agent: Robust Vision Systems in Remote Sensing via Inference-Time Scaling
by: Yu, Chung-En Johnny, et al.
Published: (2025)
SCoOP: Semantic Consistent Opinion Pooling for Uncertainty Quantification in Multiple Vision-Language Model Systems
by: Yu, Chung-En Johnny, et al.
Published: (2026)
AIDE: Agentically Improve Visual Language Model with Domain Experts
by: Chiu, Ming-Chang, et al.
Published: (2025)
Autonomous Computer Vision Development with Agentic AI
by: Kim, Jin, et al.
Published: (2025)
Concept-RuleNet: Grounded Multi-Agent Neurosymbolic Reasoning in Vision Language Models
by: Sinha, Sanchit, et al.
Published: (2025)
Enhancing Agentic Autonomous Scientific Discovery with Vision-Language Model Capabilities
by: Gandhi, Kahaan, et al.
Published: (2025)
EH-Benchmark Ophthalmic Hallucination Benchmark and Agent-Driven Top-Down Traceable Reasoning Workflow
by: Pan, Xiaoyu, et al.
Published: (2025)
Gen-n-Val: Agentic Image Data Generation and Validation
by: Huang, Jing-En, et al.
Published: (2025)
An Agentic System for Rare Disease Diagnosis with Traceable Reasoning
by: Zhao, Weike, et al.
Published: (2025)
Agentic-J: An AI Agent for Biological Microscopy Image Analysis
by: Johanns, Lukas, et al.
Published: (2026)
ReCCur: A Recursive Corner-Case Curation Framework for Robust Vision-Language Understanding in Open and Edge Scenarios
by: Wei, Yihan, et al.
Published: (2026)
See it. Say it. Sorted: Agentic System for Compositional Diagram Generation
by: Zhang, Hantao, et al.
Published: (2025)
PhotoFlow: Agentic 3D Virtual Photography Missions
by: Guo, Jiarui, et al.
Published: (2026)
V-Agent: An Interactive Video Search System Using Vision-Language Models
by: Park, SunYoung, et al.
Published: (2025)
ARGOS: Who, Where, and When in Agentic Multi-Camera Person Search
by: Kim, Myungchul, et al.
Published: (2026)
Chain-of-Anomaly Thoughts with Large Vision-Language Models
by: Domingos, Pedro, et al.
Published: (2025)
MAViS: A Multi-Agent Framework for Long-Sequence Video Storytelling
by: Wang, Qian, et al.
Published: (2025)
Visual Multi-Agent System: Mitigating Hallucination Snowballing via Visual Flow
by: Yu, Xinlei, et al.
Published: (2025)
TRACE: A Self-Improving Framework for Robot Behavior Forecasting with Vision-Language Models
by: Puthumanaillam, Gokul, et al.
Published: (2025)
Metropolis-Hastings Captioning Game: Knowledge Fusion of Vision Language Models via Decentralized Bayesian Inference
by: Matsui, Yuta, et al.
Published: (2025)
AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning
by: Song, Mingyang, et al.
Published: (2026)
How Modality Shapes Perception and Reasoning: A Study of Error Propagation in ARC-AGI
by: Wen, Bo, et al.
Published: (2025)
An Artifact-based Agent Framework for Adaptive and Reproducible Medical Image Processing
by: Zuo, Lianrui, et al.
Published: (2026)
StarCraftImage: A Dataset For Prototyping Spatial Reasoning Methods For Multi-Agent Environments
by: Kulinski, Sean, et al.
Published: (2024)
MedChat: A Multi-Agent Framework for Multimodal Diagnosis with Large Language Models
by: Liu, Philip R., et al.
Published: (2025)
VQQA: An Agentic Approach for Video Evaluation and Quality Improvement
by: Song, Yiwen, et al.
Published: (2026)
OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning
by: Lu, Pan, et al.
Published: (2025)
COMIC: Agentic Sketch Comedy Generation
by: Hong, Susung, et al.
Published: (2026)
MARIC: Multi-Agent Reasoning for Image Classification
by: Seo, Wonduk, et al.
Published: (2025)
Picking the Right Specialist: Attentive Neural Process-based Selection of Task-Specialized Models as Tools for Agentic Healthcare Systems
by: Saha, Pramit, et al.
Published: (2026)
Drive My Way: Preference Alignment of Vision-Language-Action Model for Personalized Driving
by: Wang, Zehao, et al.
Published: (2026)
Agentic Design Review System
by: Nag, Sayan, et al.
Published: (2025)
RadAgents: Multimodal Agentic Reasoning for Chest X-ray Interpretation with Radiologist-like Workflows
by: Zhang, Kai, et al.
Published: (2025)
ProtoMedAgent: Multimodal Clinical Interpretability via Privacy-Aware Agentic Workflows
by: Pellicer, Alvaro Lopez, et al.
Published: (2026)
GenCellAgent: Generalizable, Training-Free Cellular Image Segmentation via Large Language Model Agents
by: Yu, Xi, et al.
Published: (2025)
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
by: Zhang, Hongxin, et al.
Published: (2024)
VLM-Guided Iterative Refinement for Surgical Image Segmentation with Foundation Models
by: Lou, Ange, et al.
Published: (2026)
Local Prompt Adaptation for Style-Consistent Multi-Object Generation in Diffusion Models
by: Sanjyal, Ankit
Published: (2025)
Analyze-Prompt-Reason: A Collaborative Agent-Based Framework for Multi-Image Vision-Language Reasoning
by: Vlachos, Angelos, et al.
Published: (2025)