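| Main Authors: | Zheng, Heng; Shi, Yuling; Gu, Xiaodong; You, Haochen; Zhang, Zijian; Gan, Lubin; Zhang, Hao; Huang, Wenjun; Huang, Jin |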
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Computer Vision and Pattern Recognition; Graphics |
| Online Access: | https://arxiv.org/abs/2511.00908 |
| _version_ | 1866915689489498112 |
|---|---|
| author | Zheng, Heng; Shi, Yuling; Gu, Xiaodong; You, Haochen; Zhang, Zijian; Gan, Lubin; Zhang, Hao; Huang, Wenjun; Huang, Jin |
| author_facet | Zheng, Heng; Shi, Yuling; Gu, Xiaodong; You, Haochen; Zhang, Zijian; Gan, Lubin; Zhang, Hao; Huang, Wenjun; Huang, Jin |
| contents | Visual geo-localization requires extensive geographic knowledge and sophisticated reasoning to determine image locations without GPS metadata. Traditional retrieval methods are constrained by database coverage and quality. Recent Large Vision-Language Models (LVLMs) enable direct location reasoning from image content, yet individual models struggle with diverse geographic regions and complex scenes. Existing multi-agent systems improve performance through model collaboration but treat all agent interactions uniformly and lack mechanisms to handle conflicting predictions effectively. We propose GraphGeo, a multi-agent debate framework using heterogeneous graph neural networks for visual geo-localization. Our approach models diverse debate relationships through typed edges, distinguishing supportive collaboration, competitive argumentation, and knowledge transfer. We introduce a dual-level debate mechanism combining node-level refinement and edge-level argumentation modeling. A cross-level topology refinement strategy enables co-evolution between graph structure and agent representations. Experiments on multiple benchmarks demonstrate that GraphGeo significantly outperforms state-of-the-art methods. Our framework transforms cognitive conflicts between agents into enhanced geo-localization accuracy through structured debate. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2511_00908 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | GraphGeo: Multi-Agent Debate Framework for Visual Geo-localization with Heterogeneous Graph Neural Networks |
| topic | Computer Vision and Pattern Recognition; Graphics |
| url | https://arxiv.org/abs/2511.00908 |
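
The abstract describes agents connected by typed edges (supportive collaboration, competitive argumentation, knowledge transfer). The record gives no implementation details, so the following is only a minimal sketch of the general idea of relation-typed message passing, not the authors' method. The agent names, candidate locations, fixed scalar weights in `RELATION_WEIGHT`, and the function `debate_round` are all hypothetical; a real heterogeneous graph neural network would learn per-relation parameters instead.

```python
from collections import defaultdict

# Hypothetical per-relation weights (assumption, not from the paper).
# A learned heterogeneous GNN would use a weight matrix per edge type.
RELATION_WEIGHT = {"support": 0.5, "compete": -0.3, "transfer": 0.2}


def debate_round(beliefs, edges):
    """Run one synchronous round of relation-typed message passing.

    beliefs: dict mapping agent name -> list of scores over candidates
    edges:   list of (source, target, relation) tuples
    """
    # Group incoming neighbor beliefs by (target agent, relation type).
    incoming = defaultdict(list)
    for src, dst, rel in edges:
        incoming[(dst, rel)].append(beliefs[src])

    updated = {}
    for agent, own in beliefs.items():
        new = list(own)
        for rel, weight in RELATION_WEIGHT.items():
            msgs = incoming.get((agent, rel))
            if not msgs:
                continue
            # Mean-aggregate the messages of this relation, then scale.
            for i in range(len(new)):
                mean_i = sum(m[i] for m in msgs) / len(msgs)
                new[i] += weight * mean_i
        # Clamp at zero and renormalize so the scores sum to one.
        new = [max(x, 0.0) for x in new]
        total = sum(new) or 1.0
        updated[agent] = [x / total for x in new]
    return updated


# Toy example: three agents scoring three candidate countries.
candidates = ["France", "Belgium", "Canada"]
beliefs = {
    "agent_a": [0.6, 0.3, 0.1],
    "agent_b": [0.5, 0.4, 0.1],
    "agent_c": [0.1, 0.2, 0.7],
}
edges = [
    ("agent_b", "agent_a", "support"),   # b reinforces a
    ("agent_c", "agent_a", "compete"),   # c argues against a
    ("agent_a", "agent_c", "transfer"),  # a shares knowledge with c
]
beliefs = debate_round(beliefs, edges)
for agent, scores in beliefs.items():
    best = candidates[scores.index(max(scores))]
    print(agent, best, [round(s, 3) for s in scores])
```

In this toy setup a supporting edge pulls the target toward the sender's scores, a competing edge pushes it away, and a transfer edge adds a weaker positive contribution. Treating the three relations differently is the point the abstract makes against uniform agent interaction.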