:: Library Catalog

Bibliographic Details
Main Authors:	Wu, Nemin; Cao, Qian; Wang, Zhangyu; Liu, Zeping; Qi, Yanlin; Zhang, Jielu; Ni, Joshua; Yao, Xiaobai; Ma, Hongxu; Mu, Lan; Ermon, Stefano; Ganu, Tanuja; Nambi, Akshay; Lao, Ni; Mai, Gengchen
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2406.15658

Similar Items

GeoBS: Information-Theoretic Quantification of Geographic Bias in AI Models
by: Wang, Zhangyu, et al.
Published: (2025)

LocDiff: Identifying Locations on Earth by Diffusing in the Hilbert Space
by: Wang, Zhangyu, et al.
Published: (2025)

GAIR: Location-Aware Self-Supervised Contrastive Pre-Training with Geo-Aligned Implicit Representations
by: Liu, Zeping, et al.
Published: (2025)

MC-GTA: Metric-Constrained Model-Based Clustering using Goodness-of-fit Tests with Autocorrelations
by: Wang, Zhangyu, et al.
Published: (2024)

MMCTAgent: Multi-modal Critical Thinking Agent Framework for Complex Visual Reasoning
by: Kumar, Somnath, et al.
Published: (2024)

Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation
by: Zhou, Zhongliang, et al.
Published: (2024)

Probing the Information Theoretical Roots of Spatial Dependence Measures
by: Wang, Zhangyu, et al.
Published: (2024)

PromptWizard: Task-Aware Prompt Optimization Framework
by: Agarwal, Eshaan, et al.
Published: (2024)

EnCortex: A General, Extensible and Scalable Framework for Decision Management in New-age Energy Systems
by: Roy, Millend, et al.
Published: (2025)

Text2Seg: Remote Sensing Image Semantic Segmentation via Text-Guided Visual Foundation Models
by: Zhang, Jielu, et al.
Published: (2023)

Bridging the Language Gap: Dynamic Learning Strategies for Improving Multilingual Performance in LLMs
by: Kumar, Somnath, et al.
Published: (2023)

Bridging the Gap: Dynamic Learning Strategies for Improving Multilingual Performance in LLMs
by: Kumar, Somnath, et al.
Published: (2024)

Do You See Me: A Multidimensional Benchmark for Evaluating Visual Perception in Multimodal LLMs
by: Kanade, Aditya, et al.
Published: (2025)

Exposing Weak Links in Multi-Agent Systems under Adversarial Prompting
by: Arora, Nirmit, et al.
Published: (2025)

Chain-of-Thought Degrades Visual Spatial Reasoning Capabilities of Multimodal LLMs
by: Kancheti, Sai Srinivas, et al.
Published: (2026)

SpatialMath: Spatial Comprehension-Infused Symbolic Reasoning for Mathematical Problem-Solving
by: Bajpai, Ashutosh, et al.
Published: (2026)

Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models
by: Wang, Hengyi, et al.
Published: (2024)

Faithful GRPO: Improving Visual Spatial Reasoning in Multimodal Language Models via Constrained Policy Optimization
by: Kancheti, Sai Srinivas, et al.
Published: (2026)

RadPhi-3: Small Language Models for Radiology
by: Ranjit, Mercy, et al.
Published: (2024)

Shiksha Copilot: Teacher-AI Collaboration for Curating and Customizing Lesson Plans in Low-Resource Schools
by: Dennison, Deepak Varuvel, et al.
Published: (2025)

GeoLLM: Extracting Geospatial Knowledge from Large Language Models
by: Manvi, Rohin, et al.
Published: (2023)

TRAJGANR: Trajectory-Centric Urban Multimodal Learning via Geospatially Aligned Neural Representations
by: Siampou, Maria Despoina, et al.
Published: (2026)

Spatial-RAG: Spatial Retrieval Augmented Generation for Real-World Geospatial Reasoning Questions
by: Yu, Dazhou, et al.
Published: (2025)

Designing Culturally Aligned AI Systems For Social Good in Non-Western Contexts
by: Dennison, Deepak Varuvel, et al.
Published: (2025)

TAMAS: Benchmarking Adversarial Risks in Multi-Agent LLM Systems
by: Kavathekar, Ishan, et al.
Published: (2025)

Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering
by: Srivastava, Pragya, et al.
Published: (2024)

RAD-PHI2: Instruction Tuning PHI-2 for Radiology
by: Ranjit, Mercy, et al.
Published: (2024)

Spatial-Agent: Agentic Geo-spatial Reasoning with Scientific Core Concepts
by: Bao, Riyang, et al.
Published: (2026)

Geography According to ChatGPT -- How Generative AI Represents and Reasons about Geography
by: Janowicz, Krzysztof, et al.
Published: (2026)

Mind's Eye: A Benchmark of Visual Abstraction, Transformation and Composition for Multimodal LLMs
by: Sinha, Rohit, et al.
Published: (2026)

Automated Image-Based Identification and Consistent Classification of Fire Patterns with Quantitative Shape Analysis and Spatial Location Identification
by: Liu, Pengkun, et al.
Published: (2024)

Self-Evolved Preference Optimization for Enhancing Mathematical Reasoning in Small Language Models
by: Singh, Joykirat, et al.
Published: (2025)

Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning
by: Singh, Joykirat, et al.
Published: (2024)

An Empirical Study on the Impact of Positional Encoding in Transformer-based Monaural Speech Enhancement
by: Zhang, Qiquan, et al.
Published: (2024)

LAID: Lightweight AI-Generated Image Detection in Spatial and Spectral Domains
by: Chivaran, Nicholas, et al.
Published: (2025)

Neural Radiance Fields with Torch Units
by: Ni, Bingnan, et al.
Published: (2024)

Whose Truth? Pluralistic Geo-Alignment for (Agentic) AI
by: Janowicz, Krzysztof, et al.
Published: (2025)

Waking Up Blind: Cold-Start Optimization of Supervision-Free Agentic Trajectories for Grounded Visual Perception
by: Bajpai, Ashutosh, et al.
Published: (2026)

Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning
by: Singh, Joykirat, et al.
Published: (2025)

Locate-then-Merge: Neuron-Level Parameter Fusion for Mitigating Catastrophic Forgetting in Multimodal LLMs
by: Yu, Zeping, et al.
Published: (2025)