| Main Authors: | Yu, Xinquan, Lu, Wei, Luo, Xiangyang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2508.02479 |
Similar Items
CIEC: Coupling Implicit and Explicit Cues for Multimodal Weakly Supervised Manipulation Localization
by: Yu, Xinquan, et al.
Published: (2026)
RaCMC: Residual-Aware Compensation Network with Multi-Granularity Constraints for Fake News Detection
by: Yu, Xinquan, et al.
Published: (2024)
Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception
by: He, Junwen, et al.
Published: (2024)
MFCLIP: Multi-modal Fine-grained CLIP for Generalizable Diffusion Face Forgery Detection
by: Zhang, Yaning, et al.
Published: (2024)
Multi-modal Reference Learning for Fine-grained Text-to-Image Retrieval
by: Ma, Zehong, et al.
Published: (2025)
Diffusion-based Adversarial Identity Manipulation for Facial Privacy Protection
by: Wang, Liqin, et al.
Published: (2025)
Fine-grained Context and Multi-modal Alignment for Freehand 3D Ultrasound Reconstruction
by: Yan, Zhongnuo, et al.
Published: (2024)
MMHead: Towards Fine-grained Multi-modal 3D Facial Animation
by: Wu, Sijing, et al.
Published: (2024)
MARE: Multimodal Alignment and Reinforcement for Explainable Deepfake Detection via Vision-Language Models
by: Xu, Wenbo, et al.
Published: (2026)
Fine-grained Action Analysis: A Multi-modality and Multi-task Dataset of Figure Skating
by: Liu, Sheng-Lan, et al.
Published: (2023)
Simplify Implant Depth Prediction as Video Grounding: A Texture Perceive Implant Depth Prediction Network
by: Yang, Xinquan, et al.
Published: (2024)
Exploiting Modality-Specific Features For Multi-Modal Manipulation Detection And Grounding
by: Wang, Jiazhen, et al.
Published: (2023)
ID-Guard: A Universal Framework for Combating Facial Manipulation via Breaking Identification
by: Qu, Zuomin, et al.
Published: (2024)
Semantic Change Detection of Roads and Bridges: A Fine-grained Dataset and Multimodal Frequency-driven Detector
by: Shu, Qingling, et al.
Published: (2025)
Weakly Supervised Multimodal Temporal Forgery Localization via Multitask Learning
by: Xu, Wenbo, et al.
Published: (2025)
Semi-distributed Cross-modal Air-Ground Relative Localization
by: Lu, Weining, et al.
Published: (2025)
Multi-view Crowd Tracking Transformer with View-Ground Interactions Under Large Real-World Scenes
by: Zhang, Qi, et al.
Published: (2026)
Multi-modal Representations for Fine-grained Multi-label Critical View of Safety Recognition
by: Baby, Britty, et al.
Published: (2025)
Fine-grained Spatiotemporal Grounding on Egocentric Videos
by: Liang, Shuo, et al.
Published: (2025)
FineCLIPER: Multi-modal Fine-grained CLIP for Dynamic Facial Expression Recognition with AdaptERs
by: Chen, Haodong, et al.
Published: (2024)
BFA: Best-Feature-Aware Fusion for Multi-View Fine-grained Manipulation
by: Lan, Zihan, et al.
Published: (2025)
Fine-grained Dynamic Network for Generic Event Boundary Detection
by: Zheng, Ziwei, et al.
Published: (2024)
Language-driven Fine-grained Retrieval
by: Wang, Shijie, et al.
Published: (2025)
Towards Open-world Generalized Deepfake Detection: General Feature Extraction via Unsupervised Domain Adaptation
by: Guo, Midou, et al.
Published: (2025)
Multi-modality Anomaly Segmentation on the Road
by: Gao, Heng, et al.
Published: (2025)
FineRS: Fine-grained Reasoning and Segmentation of Small Objects with Reinforcement Learning
by: Zhang, Lu, et al.
Published: (2025)
Q-Ground: Image Quality Grounding with Large Multi-modality Models
by: Chen, Chaofeng, et al.
Published: (2024)
GroundingGPT: Language Enhanced Multi-modal Grounding Model
by: Li, Zhaowei, et al.
Published: (2024)
Safeguarding Facial Identity against Diffusion-based Face Swapping via Cascading Pathway Disruption
by: Wang, Liqin, et al.
Published: (2026)
GLCF: A Global-Local Multimodal Coherence Analysis Framework for Talking Face Generation Detection
by: Chen, Xiaocan, et al.
Published: (2024)
LLMs Meet VLMs: Boost Open Vocabulary Object Detection with Fine-grained Descriptors
by: Jin, Sheng, et al.
Published: (2024)
Cross-modal Full-mode Fine-grained Alignment for Text-to-Image Person Retrieval
by: Yin, Hao, et al.
Published: (2025)
HiVG: Hierarchical Multimodal Fine-grained Modulation for Visual Grounding
by: Xiao, Linhui, et al.
Published: (2024)
Self-paced Multi-grained Cross-modal Interaction Modeling for Referring Expression Comprehension
by: Miao, Peihan, et al.
Published: (2022)
Unleashing the Potential of Consistency Learning for Detecting and Grounding Multi-Modal Media Manipulation
by: Li, Yiheng, et al.
Published: (2025)
Fast-then-Fine: A Two-Stage Framework with Multi-Granular Representation for Cross-Modal Retrieval in Remote Sensing
by: Chen, Xi, et al.
Published: (2026)
AU-Blendshape for Fine-grained Stylized 3D Facial Expression Manipulation
by: Li, Hao, et al.
Published: (2025)
Visual Grounding with Multi-modal Conditional Adaptation
by: Yao, Ruilin, et al.
Published: (2024)
Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
by: Wang, Haibo, et al.
Published: (2024)
UB-FineNet: Urban Building Fine-grained Classification Network for Open-access Satellite Images
by: He, Zhiyi, et al.
Published: (2024)