Bibliographic Details
Main Authors:	Ruan, Bo-Kai; Ni, Zi-Xiang; Huang, Bo-Lun; Hsiao, Teng-Fang; Shuai, Hong-Han
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2505.20808

Similar Items

PromptMoG: Enhancing Diversity in Long-Prompt Image Generation via Prompt Embedding Mixture-of-Gaussian Sampling
by: Ruan, Bo-Kai, et al.
Published: (2025)

Training-and-Prompt-Free General Painterly Harmonization via Zero-Shot Disentenglement on Style and Content References
by: Hsiao, Teng-Fang, et al.
Published: (2024)

VecSet-Edit: Unleashing Pre-trained LRM for Mesh Editing from Single Image
by: Hsiao, Teng-Fang, et al.
Published: (2026)

TF-TI2I: Training-Free Text-and-Image-to-Image Generation via Multi-Modal Implicit-Context Learning in Text-to-Image Models
by: Hsiao, Teng-Fang, et al.
Published: (2025)

Is the Future Compatible? Diagnosing Dynamic Consistency in World Action Models
by: Ruan, Bo-Kai, et al.
Published: (2026)

FreeCond: Free Lunch in the Input Conditions of Text-Guided Inpainting
by: Hsiao, Teng-Fang, et al.
Published: (2024)

MAD: Makeup All-in-One with Cross-Domain Diffusion Model
by: Ruan, Bo-Kai, et al.
Published: (2025)

Ranking-based Preference Optimization for Diffusion Models from Implicit User Feedback
by: Wu, Yi-Lun, et al.
Published: (2025)

Color Me Correctly: Bridging Perceptual Color Spaces and Text Embeddings for Improved Diffusion Generation
by: Tsai, Sung-Lin, et al.
Published: (2025)

PRISM: Streaming Human Motion Generation with Per-Joint Latent Decomposition
by: Ling, Zeyu, et al.
Published: (2026)

Precise Pick-and-Place using Score-Based Diffusion Networks
by: Guo, Shih-Wei, et al.
Published: (2024)

DevPrompt: Deviation-Based Prompt Learning for One-Normal Shot Image Anomaly Detection
by: Poudineh, Morteza, et al.
Published: (2026)

HGFreNet: Hop-hybrid GraphFomer for 3D Human Pose Estimation with Trajectory Consistency in Frequency Domain
by: Zhai, Kai, et al.
Published: (2025)

The Fabrication of Reality and Fantasy: Scene Generation with LLM-Assisted Prompt Interpretation
by: Yao, Yi, et al.
Published: (2024)

ANYPORTAL: Zero-Shot Consistent Video Background Replacement
by: Gao, Wenshuo, et al.
Published: (2025)

HumanScore: Benchmarking Human Motions in Generated Videos
by: Fang, Yusu, et al.
Published: (2026)

Domain Generalization for Face Anti-spoofing via Content-aware Composite Prompt Engineering
by: Guo, Jiabao, et al.
Published: (2025)

Dual-View Alignment Learning with Hierarchical-Prompt for Class-Imbalance Multi-Label Classification
by: Huang, Sheng, et al.
Published: (2025)

DCDet: Dynamic Cross-based 3D Object Detector
by: Liu, Shuai, et al.
Published: (2024)

When Safety Collides: Resolving Multi-Category Harmful Conflicts in Text-to-Image Diffusion via Adaptive Safety Guidance
by: Xiang, Yongli, et al.
Published: (2026)

CoDoL: Conditional Domain Prompt Learning for Out-of-Distribution Generalization
by: Zhang, Min, et al.
Published: (2025)

PianoFlow: Music-Aware Streaming Piano Motion Generation with Bimanual Coordination
by: Wang, Xuan, et al.
Published: (2026)

Detecting AI-Generated Forgeries via Iterative Manifold Deviation Amplification
by: Zhang, Jiangling, et al.
Published: (2026)

Towards Generalized Image Manipulation Localization via Score-based Model
by: Wang, Yunfei, et al.
Published: (2026)

Improving Robustness for Joint Optimization of Camera Poses and Decomposed Low-Rank Tensorial Radiance Fields
by: Cheng, Bo-Yu, et al.
Published: (2024)

VISTA: Validation-Guided Integration of Spatial and Temporal Foundation Models with Anatomical Decoding for Rare-Pathology VCE Event Detection -- after competition results
by: Qiu, Bo-Cheng, et al.
Published: (2026)

MagicDrive-V2: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control
by: Gao, Ruiyuan, et al.
Published: (2024)

MPF-Net: Exposing High-Fidelity AI-Generated Video Forgeries via Hierarchical Manifold Deviation and Micro-Temporal Fluctuations
by: He, Xinan, et al.
Published: (2026)

CaMML: Context-Aware Multimodal Learner for Large Models
by: Chen, Yixin, et al.
Published: (2024)

Temporal Consistency Constrained Transferable Adversarial Attacks with Background Mixup for Action Recognition
by: Li, Ping, et al.
Published: (2025)

Self-Supervised Learning of Deviation in Latent Representation for Co-speech Gesture Video Generation
by: Yang, Huan, et al.
Published: (2024)

Category-Prompt Refined Feature Learning for Long-Tailed Multi-Label Image Classification
by: Yan, Jiexuan, et al.
Published: (2024)

Generative World Renderer
by: Huang, Zheng-Hui, et al.
Published: (2026)

COACH: Collaborative Agents for Contextual Highlighting -- A Multi-Agent Framework for Sports Video Analysis
by: Wong, Tsz-To, et al.
Published: (2025)

Replace Anyone in Videos
by: Wang, Xiang, et al.
Published: (2024)

Harmonizing and Merging Source Models for CLIP-based Domain Generalization
by: Ding, Yuhe, et al.
Published: (2025)

Beyond Full Labels: Energy-Double-Guided Single-Point Prompt for Infrared Small Target Label Generation
by: Yuan, Shuai, et al.
Published: (2024)

FFAM: Feature Factorization Activation Map for Explanation of 3D Detectors
by: Liu, Shuai, et al.
Published: (2024)

VaMP: Variational Multi-Modal Prompt Learning for Vision-Language Models
by: Cheng, Silin, et al.
Published: (2025)

Interpretable Rheumatoid Arthritis Scoring via Anatomy-aware Multiple Instance Learning
by: Bo, Zhiyan, et al.
Published: (2025)