Bibliographic Details
Main Authors: Zheng, Heng; Shi, Yuling; Gu, Xiaodong; You, Haochen; Zhang, Zijian; Gan, Lubin; Zhang, Hao; Huang, Wenjun; Huang, Jin
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition; Graphics
Online Access: https://arxiv.org/abs/2511.00908
author Zheng, Heng
Shi, Yuling
Gu, Xiaodong
You, Haochen
Zhang, Zijian
Gan, Lubin
Zhang, Hao
Huang, Wenjun
Huang, Jin
contents Visual geo-localization requires extensive geographic knowledge and sophisticated reasoning to determine image locations without GPS metadata. Traditional retrieval methods are constrained by database coverage and quality. Recent Large Vision-Language Models (LVLMs) enable direct location reasoning from image content, yet individual models struggle with diverse geographic regions and complex scenes. Existing multi-agent systems improve performance through model collaboration but treat all agent interactions uniformly and lack mechanisms to handle conflicting predictions effectively. We propose GraphGeo, a multi-agent debate framework using heterogeneous graph neural networks for visual geo-localization. Our approach models diverse debate relationships through typed edges, distinguishing supportive collaboration, competitive argumentation, and knowledge transfer. We introduce a dual-level debate mechanism combining node-level refinement and edge-level argumentation modeling. A cross-level topology refinement strategy enables co-evolution between graph structure and agent representations. Experiments on multiple benchmarks demonstrate that GraphGeo significantly outperforms state-of-the-art methods. Our framework transforms cognitive conflicts between agents into enhanced geo-localization accuracy through structured debate.
format Preprint
id arxiv_https___arxiv_org_abs_2511_00908
institution arXiv
publishDate 2025
record_format arxiv
title GraphGeo: Multi-Agent Debate Framework for Visual Geo-localization with Heterogeneous Graph Neural Networks
topic Computer Vision and Pattern Recognition
Graphics
url https://arxiv.org/abs/2511.00908