| Main Authors: | Abdali, Sara, Shaham, Sina, Krishnamachari, Bhaskar |
|---|---|
| Format: | Preprint |
| Published: | 2022 |
| Online Access: | https://arxiv.org/abs/2203.13883 |
Similar Items
SMP Challenge: An Overview and Analysis of Social Media Prediction Challenge
by: Wu, Bo, et al.
Published: (2024)
CrisisViT: A Robust Vision Transformer for Crisis Image Classification
by: Long, Zijun, et al.
Published: (2024)
SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection
by: Qi, Peng, et al.
Published: (2024)
Personality Analysis from Online Short Video Platforms with Multi-domain Adaptation
by: An, Sixu, et al.
Published: (2024)
Visual Authority and the Rhetoric of Health Misinformation: A Multimodal Analysis of Social Media Videos
by: Zarei, Mohammad Reza, et al.
Published: (2025)
NativE: Multi-modal Knowledge Graph Completion in the Wild
by: Zhang, Yichi, et al.
Published: (2024)
VQualA 2025 Challenge on Engagement Prediction for Short Videos: Methods and Results
by: Li, Dasong, et al.
Published: (2025)
Detecting Misinformation in Multimedia Content through Cross-Modal Entity Consistency: A Dual Learning Approach
by: Fu, Zhe, et al.
Published: (2024)
Harmful YouTube Video Detection: A Taxonomy of Online Harm and MLLMs as Alternative Annotators
by: Jo, Claire Wonjeong, et al.
Published: (2024)
Integration of Policy and Reputation based Trust Mechanisms in e-Commerce Industry
by: Siddiqui, Muhammad Yasir, et al.
Published: (2024)
VGA: Vision and Graph Fused Attention Network for Rumor Detection
by: Bai, Lin, et al.
Published: (2024)
More than Memes: A Multimodal Topic Modeling Approach to Conspiracy Theories on Telegram
by: Steffen, Elisabeth
Published: (2024)
Resolving Sentiment Discrepancy for Multimodal Sentiment Detection via Semantics Completion and Decomposition
by: Wu, Daiqing, et al.
Published: (2024)
Can LLMs Create Legally Relevant Summaries and Analyses of Videos?
by: Hoeben-Kuil, Lyra, et al.
Published: (2025)
KI-Bilder und die Widerständigkeit der Medienkonvergenz: Von primärer zu sekundärer Intermedialität?
by: Wilde, Lukas R. A.
Published: (2024)
Towards nation-wide analytical healthcare infrastructures: A privacy-preserving augmented knee rehabilitation case study
by: Bačić, Boris, et al.
Published: (2024)
TraceRouter: Robust Safety for Large Foundation Models via Path-Level Intervention
by: Shi, Chuancheng, et al.
Published: (2026)
AI-based System for Transforming text and sound to Educational Videos
by: ElAlami, M. E., et al.
Published: (2026)
ObjFormer: Learning Land-Cover Changes From Paired OSM Data and Optical High-Resolution Imagery via Object-Guided Transformer
by: Chen, Hongruixuan, et al.
Published: (2023)
Delving Deep into Engagement Prediction of Short Videos
by: Li, Dasong, et al.
Published: (2024)
A Rate-Distortion-Classification Approach for Lossy Image Compression
by: Zhang, Yuefeng
Published: (2024)
Multi-modal and Multi-scale Spatial Environment Understanding for Immersive Visual Text-to-Speech
by: Liu, Rui, et al.
Published: (2024)
LLM-based Fusion of Multi-modal Features for Commercial Memorability Prediction
by: Pramov, Aleksandar
Published: (2025)
Multi-Modal Cross-Domain Alignment Network for Video Moment Retrieval
by: Fang, Xiang, et al.
Published: (2022)
FBHM: Functional Benchmarking and Steering of VLMs for Hateful Meme Detection
by: Bhaskar, Paramananda, et al.
Published: (2026)
Multi-modal Segment Assemblage Network for Ad Video Editing with Importance-Coherence Reward
by: Tang, Yolo Yunlong, et al.
Published: (2022)
PointCoT: A Multi-modal Benchmark for Explicit 3D Geometric Reasoning
by: Zhang, Dongxu, et al.
Published: (2026)
Mitigating GenAI-powered Evidence Pollution for Out-of-Context Multimodal Misinformation Detection
by: Yan, Zehong, et al.
Published: (2025)
Distilling Neuro-Symbolic Programs into 3D Multi-modal LLMs
by: Mo, Wentao, et al.
Published: (2026)
LazyVLM: Neuro-Symbolic Approach to Video Analytics
by: Jian, Xiangru, et al.
Published: (2025)
Toxic Memes: A Survey of Computational Perspectives on the Detection and Explanation of Meme Toxicities
by: Pandiani, Delfina Sol Martinez, et al.
Published: (2024)
MORALISE: A Structured Benchmark for Moral Alignment in Visual Language Models
by: Lin, Xiao, et al.
Published: (2025)
Cross-modal Causal Intervention for Alzheimer's Disease Prediction
by: Jin, Yutao, et al.
Published: (2025)
CalliffusionV2: Personalized Natural Calligraphy Generation with Flexible Multi-modal Control
by: Liao, Qisheng, et al.
Published: (2024)
CMIE: Combining MLLM Insights with External Evidence for Explainable Out-of-Context Misinformation Detection
by: Li, Fanxiao, et al.
Published: (2025)
GeoLocator: a location-integrated large multimodal model for inferring geo-privacy
by: Yang, Yifan, et al.
Published: (2023)
Audio-Omni: Extending Multi-modal Understanding to Versatile Audio Generation and Editing
by: Tian, Zeyue, et al.
Published: (2026)
Less is More: A Simple yet Effective Token Reduction Method for Efficient Multi-modal LLMs
by: Song, Dingjie, et al.
Published: (2024)
PhotoBench: Beyond Visual Matching Towards Personalized Intent-Driven Photo Retrieval
by: Xu, Tianyi, et al.
Published: (2026)
Cross-Modal Retrieval with Cauchy-Schwarz Divergence
by: Zhang, Jiahao, et al.
Published: (2025)