:: Library Catalog

Bibliographic Details
Main Authors:	Zhou, Zhongliang; Zhang, Jielu; Guan, Zihan; Hu, Mengxuan; Lao, Ni; Mu, Lan; Li, Sheng; Mai, Gengchen
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2403.19584

Similar Items

Text2Seg: Remote Sensing Image Semantic Segmentation via Text-Guided Visual Foundation Models
by: Zhang, Jielu, et al.
Published: (2023)

LocDiff: Identifying Locations on Earth by Diffusing in the Hilbert Space
by: Wang, Zhangyu, et al.
Published: (2025)

Enhanced Diagnostic Performance via Large-Resolution Inference Optimization for Pathology Foundation Models
by: Hu, Mengxuan, et al.
Published: (2026)

MC-GTA: Metric-Constrained Model-Based Clustering using Goodness-of-fit Tests with Autocorrelations
by: Wang, Zhangyu, et al.
Published: (2024)

BalancEdit: Dynamically Balancing the Generality-Locality Trade-off in Multi-modal Model Editing
by: Guo, Dongliang, et al.
Published: (2025)

No Free Lunch: Retrieval-Augmented Generation Undermines Fairness in LLMs, Even for Vigilant Users
by: Hu, Mengxuan, et al.
Published: (2024)

TorchSpatial: A Location Encoding Framework and Benchmark for Spatial Representation Learning
by: Wu, Nemin, et al.
Published: (2024)

GAIR: Location-Aware Self-Supervised Contrastive Pre-Training with Geo-Aligned Implicit Representations
by: Liu, Zeping, et al.
Published: (2025)

UFID: A Unified Framework for Input-level Backdoor Detection on Diffusion Models
by: Guan, Zihan, et al.
Published: (2024)

ImLoc: Revisiting Visual Localization with Image-based Representation
by: Jiang, Xudong, et al.
Published: (2026)

CurriculumLoc: Enhancing Cross-Domain Geolocalization through Multi-Stage Refinement
by: Hu, Boni, et al.
Published: (2023)

Benign Samples Matter! Fine-tuning On Outlier Benign Samples Severely Breaks Safety
by: Guan, Zihan, et al.
Published: (2025)

GeoLocSFT: Efficient Visual Geolocation via Supervised Fine-Tuning of Multimodal Foundation Models
by: Yi, Qiang, et al.
Published: (2025)

Spatial-RAG: Spatial Retrieval Augmented Generation for Real-World Geospatial Reasoning Questions
by: Yu, Dazhou, et al.
Published: (2025)

Multi-modal Retrieval Augmented Multi-modal Generation: Datasets, Evaluation Metrics and Strong Baselines
by: Ma, Zi-Ao, et al.
Published: (2024)

HierLoc: Hyperbolic Entity Embeddings for Hierarchical Visual Geolocation
by: Gadi, Hari Krishna, et al.
Published: (2026)

Backdoor in Seconds: Unlocking Vulnerabilities in Large Pre-trained Models via Model Editing
by: Guo, Dongliang, et al.
Published: (2024)

TRAJGANR: Trajectory-Centric Urban Multimodal Learning via Geospatially Aligned Neural Representations
by: Siampou, Maria Despoina, et al.
Published: (2026)

ProxyImg: Towards Highly-Controllable Image Representation via Hierarchical Disentangled Proxy Embedding
by: Chen, Ye, et al.
Published: (2026)

Feature-Augmented Deep Networks for Multiscale Building Segmentation in High-Resolution UAV and Satellite Imagery
by: Maniyar, Chintan B., et al.
Published: (2025)

ImgTrojan: Jailbreaking Vision-Language Models with ONE Image
by: Tao, Xijia, et al.
Published: (2024)

ImgEdit: A Unified Image Editing Dataset and Benchmark
by: Ye, Yang, et al.
Published: (2025)

ManuRAG: Multi-modal Retrieval Augmented Generation for Manufacturing Question Answering (Early Version)
by: Li, Yunqing, et al.
Published: (2026)

Provably Secure Retrieval-Augmented Generation
by: Zhou, Pengcheng, et al.
Published: (2025)

Cross-View Geolocalization and Disaster Mapping with Street-View and VHR Satellite Imagery: A Case Study of Hurricane IAN
by: Li, Hao, et al.
Published: (2024)

Image Quality Assessment: Exploring Regional Heterogeneity via Response of Adaptive Multiple Quality Factors in Dictionary Space
by: Lan, Xuting, et al.
Published: (2024)

AesExpert: Towards Multi-modality Foundation Model for Image Aesthetics Perception
by: Huang, Yipo, et al.
Published: (2024)

CLIP-Loc: Multi-modal Landmark Association for Global Localization in Object-based Maps
by: Matsuzaki, Shigemichi, et al.
Published: (2024)

Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models
by: Lai, Zhengfeng, et al.
Published: (2024)

Img2CADSeq: Image-to-CAD Generation via Sequence-Based Diffusion
by: Tan, Shiyu, et al.
Published: (2026)

GeoSearch: Augmenting Worldwide Geolocalization with Web-Scale Reverse Image Search and Image Matching
by: Le-Duc, Tung-Duong, et al.
Published: (2026)

Image Quality Assessment: Exploring Quality Awareness via Memory-driven Distortion Patterns Matching
by: Lan, Xuting, et al.
Published: (2026)

Alignment-Weighted DPO: A principled reasoning approach to improve safety alignment
by: Hu, Mengxuan, et al.
Published: (2026)

PIGEON: Predicting Image Geolocations
by: Haas, Lukas, et al.
Published: (2023)

Comparing Traditional and LLM-based Search for Image Geolocation
by: Wazzan, Albatool, et al.
Published: (2024)

Cross-modal RAG: Sub-dimensional Text-to-Image Retrieval-Augmented Generation
by: Zhu, Mengdan, et al.
Published: (2025)

Are LLMs Ready for Neural-integrated Mechanistic Modeling? A Benchmark and Agentic Framework
by: Guan, Zihan, et al.
Published: (2026)

3DProxyImg: Controllable 3D-Aware Animation Synthesis from Single Image via 2D-3D Aligned Proxy Embedding
by: Zhu, Yupeng, et al.
Published: (2025)

6Img-to-3D: Few-Image Large-Scale Outdoor Driving Scene Reconstruction
by: Gieruc, Théo, et al.
Published: (2024)

GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
by: Yan, Zhiyuan, et al.
Published: (2025)