| Main Authors: | Huang, Lang; Wu, Qiyu; Miao, Zhongtao; Yamasaki, Toshihiko |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.20008 |
Similar Items
CE-FAM: Concept-Based Explanation via Fusion of Activation Maps
by: Kuroki, Michihiro, et al.
Published: (2025)
A Multihead Continual Learning Framework for Fine-Grained Fashion Image Retrieval with Contrastive Learning and Exponential Moving Average Distillation
by: Xiao, Ling, et al.
Published: (2026)
Mirai: Autoregressive Visual Generation Needs Foresight
by: Yu, Yonghao, et al.
Published: (2026)
Bias Beyond Demographics: Probing Decision Boundaries in Black-Box LVLMs via Counterfactual VQA
by: Zhao, Zaiying, et al.
Published: (2025)
Unified Vector Floorplan Generation via Markup Representation
by: Shiohara, Kaede, et al.
Published: (2026)
BSED: Baseline Shapley-Based Explainable Detector
by: Kuroki, Michihiro, et al.
Published: (2023)
Face2Diffusion for Fast and Editable Face Personalization
by: Shiohara, Kaede, et al.
Published: (2024)
Attribute-Guided Multi-Level Attention Network for Fine-Grained Fashion Retrieval
by: Xiao, Ling, et al.
Published: (2022)
Iterative Self-Improvement of Vision Language Models for Image Scoring and Self-Explanation
by: Tanji, Naoto, et al.
Published: (2025)
SCOMatch: Alleviating Overtrusting in Open-set Semi-supervised Learning
by: Wang, Zerun, et al.
Published: (2024)
ControlVP: Interactive Geometric Refinement of AI-Generated Images with Consistent Vanishing Points
by: Okumura, Ryota, et al.
Published: (2025)
ExposeAnyone: Personalized Audio-to-Expression Diffusion Models Are Robust Zero-Shot Face Forgery Detectors
by: Shiohara, Kaede, et al.
Published: (2026)
Language-guided Detection and Mitigation of Unknown Dataset Bias
by: Zhao, Zaiying, et al.
Published: (2024)
Adversarial Training from Mean Field Perspective
by: Kumano, Soichiro, et al.
Published: (2025)
Theoretical Understanding of Learning from Adversarial Perturbations
by: Kumano, Soichiro, et al.
Published: (2024)
Wide Two-Layer Networks can Learn from Adversarial Perturbations
by: Kumano, Soichiro, et al.
Published: (2024)
Adversarially Pretrained Transformers May Be Universally Robust In-Context Learners
by: Kumano, Soichiro, et al.
Published: (2025)
Auto-Comp: An Automated Pipeline for Scalable Compositional Probing of Contrastive Vision-Language Models
by: Sbrolli, Cristian, et al.
Published: (2026)
Difficulty Controlled Diffusion Model for Synthesizing Effective Training Data
by: Wang, Zerun, et al.
Published: (2024)
From Obstacles to Resources: Semi-supervised Learning Faces Synthetic Data Contamination
by: Wang, Zerun, et al.
Published: (2024)
Language-Guided Self-Supervised Video Summarization Using Text Semantic Matching Considering the Diversity of the Video
by: Sugihara, Tomoya, et al.
Published: (2024)
Continual Distillation of Teachers from Different Domains
by: Michel, Nicolas, et al.
Published: (2026)
Online Open-set Semi-supervised Object Detection with Dual Competing Head
by: Wang, Zerun, et al.
Published: (2023)
Spectral Probing of Feature Upsamplers in 2D-to-3D Scene Reconstruction
by: Xiao, Ling, et al.
Published: (2026)
Reward Incremental Learning in Text-to-Image Generation
by: Wang, Maorong, et al.
Published: (2024)
Dealing with Synthetic Data Contamination in Online Continual Learning
by: Wang, Maorong, et al.
Published: (2024)
Trifuse: Enhancing Attention-Based GUI Grounding via Multimodal Fusion
by: Ma, Longhui, et al.
Published: (2026)
Joint Top-Down and Bottom-Up Frameworks for 3D Visual Grounding
by: Liu, Yang, et al.
Published: (2024)
Probing Multimodal Fusion in the Brain: The Dominance of Audiovisual Streams in Naturalistic Encoding
by: Abdollahi, Hamid, et al.
Published: (2025)
StrokeFusion: Vector Sketch Generation via Joint Stroke-UDF Encoding and Latent Sequence Diffusion
by: Zhou, Jin, et al.
Published: (2025)
Data-Driven Prediction of Seismic Intensity Distributions Featuring Hybrid Classification-Regression Models
by: Mizutani, Koyu, et al.
Published: (2024)
Robust Deepfake Detection for Electronic Know Your Customer Systems Using Registered Images
by: Amada, Takuma, et al.
Published: (2025)
SOLAR: Self-supervised Joint Learning for Symmetric Multimodal Retrieval
by: Yang, Wenjie, et al.
Published: (2026)
A Multimodal RAG Framework for Housing Damage Assessment: Collaborative Optimization of Image Encoding and Policy Vector Retrieval
by: Miao, Jiayi, et al.
Published: (2025)
Dual-task Mutual Reinforcing Embedded Joint Video Paragraph Retrieval and Grounding
by: Wang, Mengzhao, et al.
Published: (2024)
SwimVG: Step-wise Multimodal Fusion and Adaption for Visual Grounding
by: Shi, Liangtao, et al.
Published: (2025)
Parameter-Efficient Modality-Balanced Symmetric Fusion for Multimodal Remote Sensing Semantic Segmentation
by: Li, Haocheng, et al.
Published: (2026)
PolyGen: Fully Synthetic Vision-Language Training via Multi-Generator Ensembles
by: Brusini, Leonardo, et al.
Published: (2026)
A Benchmark and Knowledge-Grounded Framework for Advanced Multimodal Personalization Study
by: Hu, Xia, et al.
Published: (2026)
MRAD: Zero-Shot Anomaly Detection with Memory-Driven Retrieval
by: Xu, Chaoran, et al.
Published: (2026)