| Main Authors: | Zhou, Zhongliang, Zhang, Jielu, Guan, Zihan, Hu, Mengxuan, Lao, Ni, Mu, Lan, Li, Sheng, Mai, Gengchen |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2403.19584 |
Similar Items
Text2Seg: Remote Sensing Image Semantic Segmentation via Text-Guided Visual Foundation Models
by: Zhang, Jielu, et al.
Published: (2023)
LocDiff: Identifying Locations on Earth by Diffusing in the Hilbert Space
by: Wang, Zhangyu, et al.
Published: (2025)
Enhanced Diagnostic Performance via Large-Resolution Inference Optimization for Pathology Foundation Models
by: Hu, Mengxuan, et al.
Published: (2026)
MC-GTA: Metric-Constrained Model-Based Clustering using Goodness-of-fit Tests with Autocorrelations
by: Wang, Zhangyu, et al.
Published: (2024)
BalancEdit: Dynamically Balancing the Generality-Locality Trade-off in Multi-modal Model Editing
by: Guo, Dongliang, et al.
Published: (2025)
No Free Lunch: Retrieval-Augmented Generation Undermines Fairness in LLMs, Even for Vigilant Users
by: Hu, Mengxuan, et al.
Published: (2024)
TorchSpatial: A Location Encoding Framework and Benchmark for Spatial Representation Learning
by: Wu, Nemin, et al.
Published: (2024)
GAIR: Location-Aware Self-Supervised Contrastive Pre-Training with Geo-Aligned Implicit Representations
by: Liu, Zeping, et al.
Published: (2025)
UFID: A Unified Framework for Input-level Backdoor Detection on Diffusion Models
by: Guan, Zihan, et al.
Published: (2024)
ImLoc: Revisiting Visual Localization with Image-based Representation
by: Jiang, Xudong, et al.
Published: (2026)
CurriculumLoc: Enhancing Cross-Domain Geolocalization through Multi-Stage Refinement
by: Hu, Boni, et al.
Published: (2023)
Benign Samples Matter! Fine-tuning On Outlier Benign Samples Severely Breaks Safety
by: Guan, Zihan, et al.
Published: (2025)
GeoLocSFT: Efficient Visual Geolocation via Supervised Fine-Tuning of Multimodal Foundation Models
by: Yi, Qiang, et al.
Published: (2025)
Spatial-RAG: Spatial Retrieval Augmented Generation for Real-World Geospatial Reasoning Questions
by: Yu, Dazhou, et al.
Published: (2025)
Multi-modal Retrieval Augmented Multi-modal Generation: Datasets, Evaluation Metrics and Strong Baselines
by: Ma, Zi-Ao, et al.
Published: (2024)
HierLoc: Hyperbolic Entity Embeddings for Hierarchical Visual Geolocation
by: Gadi, Hari Krishna, et al.
Published: (2026)
Backdoor in Seconds: Unlocking Vulnerabilities in Large Pre-trained Models via Model Editing
by: Guo, Dongliang, et al.
Published: (2024)
TRAJGANR: Trajectory-Centric Urban Multimodal Learning via Geospatially Aligned Neural Representations
by: Siampou, Maria Despoina, et al.
Published: (2026)
ProxyImg: Towards Highly-Controllable Image Representation via Hierarchical Disentangled Proxy Embedding
by: Chen, Ye, et al.
Published: (2026)
Feature-Augmented Deep Networks for Multiscale Building Segmentation in High-Resolution UAV and Satellite Imagery
by: Maniyar, Chintan B., et al.
Published: (2025)
ImgTrojan: Jailbreaking Vision-Language Models with ONE Image
by: Tao, Xijia, et al.
Published: (2024)
ImgEdit: A Unified Image Editing Dataset and Benchmark
by: Ye, Yang, et al.
Published: (2025)
ManuRAG: Multi-modal Retrieval Augmented Generation for Manufacturing Question Answering (Early Version)
by: Li, Yunqing, et al.
Published: (2026)
Provably Secure Retrieval-Augmented Generation
by: Zhou, Pengcheng, et al.
Published: (2025)
Cross-View Geolocalization and Disaster Mapping with Street-View and VHR Satellite Imagery: A Case Study of Hurricane IAN
by: Li, Hao, et al.
Published: (2024)
Image Quality Assessment: Exploring Regional Heterogeneity via Response of Adaptive Multiple Quality Factors in Dictionary Space
by: Lan, Xuting, et al.
Published: (2024)
AesExpert: Towards Multi-modality Foundation Model for Image Aesthetics Perception
by: Huang, Yipo, et al.
Published: (2024)
CLIP-Loc: Multi-modal Landmark Association for Global Localization in Object-based Maps
by: Matsuzaki, Shigemichi, et al.
Published: (2024)
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models
by: Lai, Zhengfeng, et al.
Published: (2024)
Img2CADSeq: Image-to-CAD Generation via Sequence-Based Diffusion
by: Tan, Shiyu, et al.
Published: (2026)
GeoSearch: Augmenting Worldwide Geolocalization with Web-Scale Reverse Image Search and Image Matching
by: Le-Duc, Tung-Duong, et al.
Published: (2026)
Image Quality Assessment: Exploring Quality Awareness via Memory-driven Distortion Patterns Matching
by: Lan, Xuting, et al.
Published: (2026)
Alignment-Weighted DPO: A principled reasoning approach to improve safety alignment
by: Hu, Mengxuan, et al.
Published: (2026)
PIGEON: Predicting Image Geolocations
by: Haas, Lukas, et al.
Published: (2023)
Comparing Traditional and LLM-based Search for Image Geolocation
by: Wazzan, Albatool, et al.
Published: (2024)
Cross-modal RAG: Sub-dimensional Text-to-Image Retrieval-Augmented Generation
by: Zhu, Mengdan, et al.
Published: (2025)
Are LLMs Ready for Neural-integrated Mechanistic Modeling? A Benchmark and Agentic Framework
by: Guan, Zihan, et al.
Published: (2026)
3DProxyImg: Controllable 3D-Aware Animation Synthesis from Single Image via 2D-3D Aligned Proxy Embedding
by: Zhu, Yupeng, et al.
Published: (2025)
6Img-to-3D: Few-Image Large-Scale Outdoor Driving Scene Reconstruction
by: Gieruc, Théo, et al.
Published: (2024)
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
by: Yan, Zhiyuan, et al.
Published: (2025)