| Main Authors: | Howell, Anthony, Wu, Nancy, Bagchi, Sharmistha, Kim, Yushim, Sun, Chayn |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.15132 |
Similar Items
A Reproducible Workflow for Scraping, Structuring, and Segmenting Legacy Archaeological Artifact Images
by: Palomeque-Gonzalez, Juan
Published: (2025)
Decoding Tourist Perception in Historic Urban Quarters with Multimodal Social Media Data: An AI-Based Framework and Evidence from Shanghai
by: Tan, Kaizhen, et al.
Published: (2025)
TrajSceneLLM: A Multimodal Perspective on Semantic GPS Trajectory Analysis
by: Ji, Chunhou, et al.
Published: (2025)
BLEnD-Vis: Benchmarking Multimodal Cultural Understanding in Vision Language Models
by: Tan, Bryan Chen Zhengyu, et al.
Published: (2025)
PLAS-Net: Pixel-Level Area Segmentation for UAV-Based Beach Litter Monitoring
by: Liu, Yongying, et al.
Published: (2026)
BuildingView: Constructing Urban Building Exteriors Databases with Street View Imagery and Multimodal Large Language Mode
by: Li, Zongrong, et al.
Published: (2024)
MINGLE: VLMs for Semantically Complex Region Detection in Urban Scenes
by: Liu, Liu, et al.
Published: (2025)
From Pixels to Words: Leveraging Explainability in Face Recognition through Interactive Natural Language Processing
by: DeAndres-Tame, Ivan, et al.
Published: (2024)
Predicting Local Climate Zones using Urban Morphometrics and Satellite Imagery
by: Majer, Hugo, et al.
Published: (2026)
Deep Umbra: A Generative Approach for Sunlight Access Computation in Urban Spaces
by: Omar, Kazi Shahrukh, et al.
Published: (2024)
How Many Visual Levers Drive Urban Perception? Interventional Counterfactuals via Multiple Localised Edits
by: Tang, Jason, et al.
Published: (2026)
From Content to Audience: A Multimodal Annotation Framework for Broadcast Television Analytics
by: Cupini, Paolo, et al.
Published: (2026)
LLM-Driven Completeness and Consistency Evaluation for Cultural Heritage Data Augmentation in Cross-Modal Retrieval
by: Zhang, Jian, et al.
Published: (2025)
Ethical Considerations for the Military Use of Artificial Intelligence in Visual Reconnaissance
by: Anneken, Mathias, et al.
Published: (2025)
ERIT Lightweight Multimodal Dataset for Elderly Emotion Recognition and Multimodal Fusion Evaluation
by: Frieske, Rita, et al.
Published: (2024)
AI-Generated Figures in Academic Publishing: Policies, Tools, and Practical Guidelines
by: Chen, Davie
Published: (2026)
Urban Socio-Semantic Segmentation with Vision-Language Reasoning
by: Wang, Yu, et al.
Published: (2026)
Training-Free Multimodal Deepfake Detection via Graph Reasoning
by: Liu, Yuxin, et al.
Published: (2025)
Seeing Candidates at Scale: Multimodal LLMs for Visual Political Communication on Instagram
by: Achmann-Denkler, Michael, et al.
Published: (2026)
Diagnosing Urban Street Vitality via a Visual-Semantic and Spatiotemporal Framework for Street-Level Economics
by: Zhuo, Xinxin, et al.
Published: (2026)
Recovering Parametric Scenes from Very Few Time-of-Flight Pixels
by: Sifferman, Carter, et al.
Published: (2025)
Surgeons Awareness, Expectations, and Involvement with Artificial Intelligence: a Survey Pre and Post the GPT Era
by: Arboit, Lorenzo, et al.
Published: (2025)
Learning Multimodal Cues of Children's Uncertainty
by: Cheng, Qi, et al.
Published: (2024)
Vitamin N: Benefits of Different Forms of Public Greenery for Urban Health
by: Šćepanović, Sanja, et al.
Published: (2025)
Using Multimodal Large Language Models for Automated Detection of Traffic Safety Critical Events
by: Tami, Mohammad Abu, et al.
Published: (2024)
From Review to Design: Ethical Multimodal Driver Monitoring Systems for Risk Mitigation, Incident Response, and Accountability in Automated Vehicles
by: Khana, Bilal, et al.
Published: (2026)
From Reasoning to Pixels: Benchmarking the Alignment Gap in Unified Multimodal Models
by: Yang, Cheng, et al.
Published: (2026)
Two Stage Context Learning with Large Language Models for Multimodal Stance Detection on Climate Change
by: Pangtey, Lata, et al.
Published: (2025)
AI's Blind Spots: Geographic Knowledge and Diversity Deficit in Generated Urban Scenario
by: Beneduce, Ciro, et al.
Published: (2025)
Are Multimodal LLMs Ready for Clinical Dermatology? A Real-World Evaluation in Dermatology
by: Jiang, Roy, et al.
Published: (2026)
EDU-CIRCUIT-HW: Evaluating Multimodal Large Language Models on Real-World University-Level STEM Student Handwritten Solutions
by: Sun, Weiyu, et al.
Published: (2026)
Restoring Ancient Ideograph: A Multimodal Multitask Neural Network Approach
by: Duan, Siyu, et al.
Published: (2024)
Do Street View Imagery and Public Participation GIS align: Comparative Analysis of Urban Attractiveness
by: Malekzadeh, Milad, et al.
Published: (2025)
MM-Soc: Benchmarking Multimodal Large Language Models in Social Media Platforms
by: Jin, Yiqiao, et al.
Published: (2024)
A High Resolution Urban and Rural Settlement Map of Africa Using Deep Learning and Satellite Imagery
by: Kakooei, Mohammad, et al.
Published: (2024)
Monitoring of Urban Changes with multi-modal Sentinel 1 and 2 Data in Mariupol, Ukraine, in 2022/23
by: Zitzlsberger, Georg, et al.
Published: (2023)
Multimodal Political Bias Identification and Neutralization
by: Bernard, Cedric, et al.
Published: (2025)
Improved Digital Therapy for Developmental Pediatrics Using Domain-Specific Artificial Intelligence: Machine Learning Study
by: Washington, Peter, et al.
Published: (2020)
Design2Code: Benchmarking Multimodal Code Generation for Automated Front-End Engineering
by: Si, Chenglei, et al.
Published: (2024)
No One Knows the State of the Art in Geospatial Foundation Models
by: Corley, Isaac, et al.
Published: (2026)