| Main Authors: | Kang, Taewon, Kothandaraman, Divya, Lin, Ming C. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.06310 |
Similar Items
3D-free meets 3D priors: Novel View Synthesis from a Single Image with Pretrained Diffusion Guidance
by: Kang, Taewon, et al.
Published: (2024)
Differentiable Frequency-based Disentanglement for Aerial Video Action Recognition
by: Kothandaraman, Divya, et al.
Published: (2022)
Character-Centered Dialogue Generation from Scene-Level Prompts
by: Kang, Taewon, et al.
Published: (2025)
NEGATE: Constrained Semantic Guidance for Linguistic Negation in Text-to-Video Diffusion
by: Kang, Taewon, et al.
Published: (2026)
Financial Models in Generative Art: Black-Scholes-Inspired Concept Blending in Text-to-Image Diffusion
by: Kothandaraman, Divya, et al.
Published: (2024)
Text Prompting for Multi-Concept Video Customization by Autoregressive Generation
by: Kothandaraman, Divya, et al.
Published: (2024)
HawkI: Homography & Mutual Information Guidance for 3D-free Single Image to Aerial View
by: Kothandaraman, Divya, et al.
Published: (2023)
ImPoster: Text and Frequency Guidance for Subject Driven Action Personalization using Diffusion Models
by: Kothandaraman, Divya, et al.
Published: (2024)
BoMuDANet: Unsupervised Adaptation for Visual Scene Understanding in Unstructured Driving Environments
by: Kothandaraman, Divya, et al.
Published: (2020)
Beyond Memorization: Selective Learning for Copyright-Safe Diffusion Model Training
by: Kothandaraman, Divya, et al.
Published: (2025)
Placing Human Animations into 3D Scenes by Learning Interaction- and Geometry-Driven Keyframes
by: Mullen Jr, James F., et al.
Published: (2022)
SS-SFDA : Self-Supervised Source-Free Domain Adaptation for Road Segmentation in Hazardous Environments
by: Kothandaraman, Divya, et al.
Published: (2020)
Zero-Shot Personalized Camera Motion Control for Image-to-Video Synthesis
by: Guhan, Pooja, et al.
Published: (2025)
Trajectory-Guided Diffusion for Foreground-Preserving Background Generation in Multi-Layer Documents
by: Kang, Taewon
Published: (2026)
RegionRoute: Regional Style Transfer with Diffusion Model
by: Chen, Bowen, et al.
Published: (2026)
Low-Bitrate Video Compression through Semantic-Conditioned Diffusion
by: Wang, Lingdong, et al.
Published: (2025)
Text-Conditioned Background Generation for Editable Multi-Layer Documents
by: Kang, Taewon, et al.
Published: (2025)
DCR: Counterfactual Attractor Guidance for Rare Compositional Generation
by: Kang, Taewon, et al.
Published: (2026)
HART: Human Aligned Reconstruction Transformer
by: Chen, Xiyi, et al.
Published: (2025)
SALAD: Source-free Active Label-Agnostic Domain Adaptation for Classification, Segmentation and Detection
by: Kothandaraman, Divya, et al.
Published: (2022)
Region Prompt Tuning: Fine-grained Scene Text Detection Utilizing Region Text Prompt
by: Lin, Xingtao, et al.
Published: (2024)
ActionVOS: Actions as Prompts for Video Object Segmentation
by: Ouyang, Liangyang, et al.
Published: (2024)
StoryMem: Multi-shot Long Video Storytelling with Memory
by: Zhang, Kaiwen, et al.
Published: (2025)
The Lost Melody: Empirical Observations on Text-to-Video Generation From A Storytelling Perspective
by: Shin, Andrew, et al.
Published: (2024)
3D Scene Prompting for Scene-Consistent Camera-Controllable Video Generation
by: Lee, JoungBin, et al.
Published: (2025)
Chat-Edit-3D: Interactive 3D Scene Editing via Text Prompts
by: Fang, Shuangkang, et al.
Published: (2024)
DEVIAS: Learning Disentangled Video Representations of Action and Scene
by: Bae, Kyungho, et al.
Published: (2023)
SceneEval: Evaluating Semantic Coherence in Text-Conditioned 3D Indoor Scene Synthesis
by: Tam, Hou In Ivan, et al.
Published: (2025)
DynamicScaler: Seamless and Scalable Video Generation for Panoramic Scenes
by: Liu, Jinxiu, et al.
Published: (2024)
Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning
by: Ruan, Penghui, et al.
Published: (2024)
Tri-Prompting: Video Diffusion with Unified Control over Scene, Subject, and Motion
by: Zhou, Zhenghong, et al.
Published: (2026)
Precise Action-to-Video Generation Through Visual Action Prompts
by: Wang, Yuang, et al.
Published: (2025)
Event-Driven Storytelling with Multiple Lifelike Humans in a 3D Scene
by: Lim, Donggeun, et al.
Published: (2025)
Coherent 3D Portrait Video Reconstruction via Triplane Fusion
by: Wang, Shengze, et al.
Published: (2024)
CMFN: Cross-Modal Fusion Network for Irregular Scene Text Recognition
by: Zheng, Jinzhi, et al.
Published: (2024)
TIP-Editor: An Accurate 3D Editor Following Both Text-Prompts And Image-Prompts
by: Zhuang, Jingyu, et al.
Published: (2024)
Action Reimagined: Text-to-Pose Video Editing for Dynamic Human Actions
by: Wang, Lan, et al.
Published: (2024)
CI-VID: A Coherent Interleaved Text-Video Dataset
by: Ju, Yiming, et al.
Published: (2025)
Scene-Text Grounding for Text-Based Video Question Answering
by: Zhou, Sheng, et al.
Published: (2024)
Action-Guided Attention for Video Action Anticipation
by: Tai, Tsung-Ming, et al.
Published: (2026)