| Main Authors: | Ruan, Bo-Kai, Ni, Zi-Xiang, Huang, Bo-Lun, Hsiao, Teng-Fang, Shuai, Hong-Han |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2505.20808 |
Similar Items
PromptMoG: Enhancing Diversity in Long-Prompt Image Generation via Prompt Embedding Mixture-of-Gaussian Sampling
by: Ruan, Bo-Kai, et al.
Published: (2025)
Training-and-Prompt-Free General Painterly Harmonization via Zero-Shot Disentenglement on Style and Content References
by: Hsiao, Teng-Fang, et al.
Published: (2024)
VecSet-Edit: Unleashing Pre-trained LRM for Mesh Editing from Single Image
by: Hsiao, Teng-Fang, et al.
Published: (2026)
TF-TI2I: Training-Free Text-and-Image-to-Image Generation via Multi-Modal Implicit-Context Learning in Text-to-Image Models
by: Hsiao, Teng-Fang, et al.
Published: (2025)
Is the Future Compatible? Diagnosing Dynamic Consistency in World Action Models
by: Ruan, Bo-Kai, et al.
Published: (2026)
FreeCond: Free Lunch in the Input Conditions of Text-Guided Inpainting
by: Hsiao, Teng-Fang, et al.
Published: (2024)
MAD: Makeup All-in-One with Cross-Domain Diffusion Model
by: Ruan, Bo-Kai, et al.
Published: (2025)
Ranking-based Preference Optimization for Diffusion Models from Implicit User Feedback
by: Wu, Yi-Lun, et al.
Published: (2025)
Color Me Correctly: Bridging Perceptual Color Spaces and Text Embeddings for Improved Diffusion Generation
by: Tsai, Sung-Lin, et al.
Published: (2025)
PRISM: Streaming Human Motion Generation with Per-Joint Latent Decomposition
by: Ling, Zeyu, et al.
Published: (2026)
Precise Pick-and-Place using Score-Based Diffusion Networks
by: Guo, Shih-Wei, et al.
Published: (2024)
DevPrompt: Deviation-Based Prompt Learning for One-Normal ShotImage Anomaly Detection
by: Poudineh, Morteza, et al.
Published: (2026)
HGFreNet: Hop-hybrid GraphFomer for 3D Human Pose Estimation with Trajectory Consistency in Frequency Domain
by: Zhai, Kai, et al.
Published: (2025)
The Fabrication of Reality and Fantasy: Scene Generation with LLM-Assisted Prompt Interpretation
by: Yao, Yi, et al.
Published: (2024)
ANYPORTAL: Zero-Shot Consistent Video Background Replacement
by: Gao, Wenshuo, et al.
Published: (2025)
HumanScore: Benchmarking Human Motions in Generated Videos
by: Fang, Yusu, et al.
Published: (2026)
Domain Generalization for Face Anti-spoofing via Content-aware Composite Prompt Engineering
by: Guo, Jiabao, et al.
Published: (2025)
Dual-View Alignment Learning with Hierarchical-Prompt for Class-Imbalance Multi-Label Classification
by: Huang, Sheng, et al.
Published: (2025)
DCDet: Dynamic Cross-based 3D Object Detector
by: Liu, Shuai, et al.
Published: (2024)
When Safety Collides: Resolving Multi-Category Harmful Conflicts in Text-to-Image Diffusion via Adaptive Safety Guidance
by: Xiang, Yongli, et al.
Published: (2026)
CoDoL: Conditional Domain Prompt Learning for Out-of-Distribution Generalization
by: Zhang, Min, et al.
Published: (2025)
PianoFlow: Music-Aware Streaming Piano Motion Generation with Bimanual Coordination
by: Wang, Xuan, et al.
Published: (2026)
Detecting AI-Generated Forgeries via Iterative Manifold Deviation Amplification
by: Zhang, Jiangling, et al.
Published: (2026)
Towards Generalized Image Manipulation Localization via Score-based Model
by: Wang, Yunfei, et al.
Published: (2026)
Improving Robustness for Joint Optimization of Camera Poses and Decomposed Low-Rank Tensorial Radiance Fields
by: Cheng, Bo-Yu, et al.
Published: (2024)
VISTA: Validation-Guided Integration of Spatial and Temporal Foundation Models with Anatomical Decoding for Rare-Pathology VCE Event Detection -- after competition results
by: Qiu, Bo-Cheng, et al.
Published: (2026)
MagicDrive-V2: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control
by: Gao, Ruiyuan, et al.
Published: (2024)
MPF-Net: Exposing High-Fidelity AI-Generated Video Forgeries via Hierarchical Manifold Deviation and Micro-Temporal Fluctuations
by: He, Xinan, et al.
Published: (2026)
CaMML: Context-Aware Multimodal Learner for Large Models
by: Chen, Yixin, et al.
Published: (2024)
Temporal Consistency Constrained Transferable Adversarial Attacks with Background Mixup for Action Recognition
by: Li, Ping, et al.
Published: (2025)
Self-Supervised Learning of Deviation in Latent Representation for Co-speech Gesture Video Generation
by: Yang, Huan, et al.
Published: (2024)
Category-Prompt Refined Feature Learning for Long-Tailed Multi-Label Image Classification
by: Yan, Jiexuan, et al.
Published: (2024)
Generative World Renderer
by: Huang, Zheng-Hui, et al.
Published: (2026)
COACH: Collaborative Agents for Contextual Highlighting -- A Multi-Agent Framework for Sports Video Analysis
by: Wong, Tsz-To, et al.
Published: (2025)
Replace Anyone in Videos
by: Wang, Xiang, et al.
Published: (2024)
Harmonizing and Merging Source Models for CLIP-based Domain Generalization
by: Ding, Yuhe, et al.
Published: (2025)
Beyond Full Labels: Energy-Double-Guided Single-Point Prompt for Infrared Small Target Label Generation
by: Yuan, Shuai, et al.
Published: (2024)
FFAM: Feature Factorization Activation Map for Explanation of 3D Detectors
by: Liu, Shuai, et al.
Published: (2024)
VaMP: Variational Multi-Modal Prompt Learning for Vision-Language Models
by: Cheng, Silin, et al.
Published: (2025)
Interpretable Rheumatoid Arthritis Scoring via Anatomy-aware Multiple Instance Learning
by: Bo, Zhiyan, et al.
Published: (2025)