| Main Authors: | Yan, Weicai; Ma, Xinhua; Lin, Wang; Jin, Tao |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2605.08181 |
Similar Items
Efficient Prompting for Continual Adaptation to Missing Modalities
by: Guo, Zirun, et al.
Published: (2025)
Advancing Comprehensive Aesthetic Insight with Multi-Scale Text-Guided Self-Supervised Learning
by: Liu, Yuti, et al.
Published: (2024)
Representation Surgery for Multi-Task Model Merging
by: Yang, Enneng, et al.
Published: (2024)
Visual Explanations of Image-Text Representations via Multi-Modal Information Bottleneck Attribution
by: Wang, Ying, et al.
Published: (2023)
Revisiting Multimodal KV Cache Compression: A Frequency-Domain-Guided Outlier-KV-Aware Approach
by: Yang, Yaoxin, et al.
Published: (2025)
M4V: Multi-Modal Mamba for Text-to-Video Generation
by: Huang, Jiancheng, et al.
Published: (2025)
Joint Memory Frequency and Computing Frequency Scaling for Energy-efficient DNN Inference
by: Han, Yunchu, et al.
Published: (2025)
CycleNet: Rethinking Cycle Consistency in Text-Guided Diffusion for Image Manipulation
by: Xu, Sihan, et al.
Published: (2023)
Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning for Efficient Adaptation
by: Xie, Jingjing, et al.
Published: (2024)
Uncertainty-Guided Selective Adaptation Enables Cross-Platform Predictive Fluorescence Microscopy
by: Yang, Kai-Wen K., et al.
Published: (2025)
Text-to-Image GAN with Pretrained Representations
by: You, Xiaozhou, et al.
Published: (2024)
Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning
by: Zhu, Haoyi, et al.
Published: (2024)
Integrating Frequency Guidance into Multi-source Domain Generalization for Bearing Fault Diagnosis
by: Tu, Xiaotong, et al.
Published: (2025)
Scaling 4D Representations
by: Carreira, João, et al.
Published: (2024)
DohaScript: A Large-Scale Multi-Writer Dataset for Continuous Handwritten Hindi Text
by: Singh, Kunwar Arpit, et al.
Published: (2026)
Decoupling Amplitude and Phase Attention in Frequency Domain for RGB-Event based Visual Object Tracking
by: Wang, Shiao, et al.
Published: (2026)
Semantically Guided Representation Learning For Action Anticipation
by: Diko, Anxhelo, et al.
Published: (2024)
SurgeryV2: Bridging the Gap Between Model Merging and Multi-Task Learning with Deep Representation Surgery
by: Yang, Enneng, et al.
Published: (2024)
MedSegFactory: Text-Guided Generation of Medical Image-Mask Pairs
by: Mao, Jiawei, et al.
Published: (2025)
Invariant Representation Guided Multimodal Sentiment Decoding with Sequential Variation Regularization
by: Xu, Guoyang, et al.
Published: (2024)
Compositional Text-to-Image Generation with Dense Blob Representations
by: Nie, Weili, et al.
Published: (2024)
Source-Free Domain Adaptation with Diffusion-Guided Source Data Generation
by: Chopra, Shivang, et al.
Published: (2024)
MFAF: An EVA02-Based Multi-scale Frequency Attention Fusion Method for Cross-View Geo-Localization
by: Liu, YiTong, et al.
Published: (2025)
Implicit Contrastive Representation Learning with Guided Stop-gradient
by: Lee, Byeongchan, et al.
Published: (2025)
Superclass-Guided Representation Disentanglement for Spurious Correlation Mitigation
by: Liu, Chenruo, et al.
Published: (2025)
ShapeWords: Guiding Text-to-Image Synthesis with 3D Shape-Aware Prompts
by: Petrov, Dmitry, et al.
Published: (2024)
Uncertainty Quantification via Hölder Divergence for Multi-View Representation Learning
by: Zhang, Yan, et al.
Published: (2024)
Consistent Flow Distillation for Text-to-3D Generation
by: Yan, Runjie, et al.
Published: (2025)
SuperLoRA: Parameter-Efficient Unified Adaptation of Multi-Layer Attention Modules
by: Chen, Xiangyu, et al.
Published: (2024)
Contextualized Diffusion Models for Text-Guided Image and Video Generation
by: Yang, Ling, et al.
Published: (2024)
DenseTRF: Texture-Aware Unsupervised Representation Adaptation for Surgical Scene Dense Prediction
by: Liao, Guiqiu, et al.
Published: (2026)
A Survey on Cache Methods in Diffusion Models: Toward Efficient Multi-Modal Generation
by: Liu, Jiacheng, et al.
Published: (2025)
Orchestrate Latent Expertise: Advancing Online Continual Learning with Multi-Level Supervision and Reverse Self-Distillation
by: Yan, HongWei, et al.
Published: (2024)
Exploring Text-to-Motion Generation with Human Preference
by: Sheng, Jenny, et al.
Published: (2024)
Spurious Feature Eraser: Stabilizing Test-Time Adaptation for Vision-Language Foundation Model
by: Ma, Huan, et al.
Published: (2024)
Alignment-Guided Score Matching for Text-to-Image Alignment in Diffusion Models
by: Lee, Jaa-Yeon, et al.
Published: (2026)
FACL-Attack: Frequency-Aware Contrastive Learning for Transferable Adversarial Attacks
by: Yang, Hunmin, et al.
Published: (2024)
EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits
by: Yosef, Ron, et al.
Published: (2025)
RL for Consistency Models: Faster Reward Guided Text-to-Image Generation
by: Oertell, Owen, et al.
Published: (2024)
MMCORE: MultiModal COnnection with Representation Aligned Latent Embeddings
by: Li, Zijie, et al.
Published: (2026)