| Main Authors: | Bandyopadhyay, Hmrishav, Pinnaparaju, Nikhil, Entezari, Rahim, Scott, Jim, Song, Yi-Zhe, Jampani, Varun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2511.20426 |
Similar Items
SD3.5-Flash: Distribution-Guided Distillation of Generative Flows
by: Bandyopadhyay, Hmrishav, et al.
Published: (2025)
FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations
by: Bandyopadhyay, Hmrishav, et al.
Published: (2024)
NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training
by: Chen, Dar-Yen, et al.
Published: (2024)
Normalized Attention Guidance: Universal Negative Guidance for Diffusion Models
by: Chen, Dar-Yen, et al.
Published: (2025)
What Sketch Explainability Really Means for Downstream Tasks
by: Bandyopadhyay, Hmrishav, et al.
Published: (2024)
Stable Cinemetrics : Structured Taxonomy and Evaluation for Professional Video Generation
by: Chatterjee, Agneet, et al.
Published: (2025)
Do Generalised Classifiers really work on Human Drawn Sketches?
by: Bandyopadhyay, Hmrishav, et al.
Published: (2024)
SketchINR: A First Look into Sketches as Implicit Neural Representations
by: Bandyopadhyay, Hmrishav, et al.
Published: (2024)
Doodle Your 3D: From Abstract Freehand Sketches to Precise 3D Shapes
by: Bandyopadhyay, Hmrishav, et al.
Published: (2023)
HumANDiff: Articulated Noise Diffusion for Motion-Consistent Human Video Generation
by: Hu, Tao, et al.
Published: (2026)
BWCache: Accelerating Video Diffusion Transformers through Block-Wise Caching
by: Cui, Hanshuai, et al.
Published: (2025)
SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer
by: Chen, Junsong, et al.
Published: (2025)
Incremental Open-set Domain Adaptation
by: Rakshit, Sayan, et al.
Published: (2024)
Next Block Prediction: Video Generation via Semi-Autoregressive Modeling
by: Ren, Shuhuai, et al.
Published: (2025)
FAIRT2V: Training-Free Debiasing for Text-to-Video Diffusion Models
by: Zhong, Haonan, et al.
Published: (2026)
Computational Tradeoffs in Image Synthesis: Diffusion, Masked-Token, and Next-Token Prediction
by: Kilian, Maciej, et al.
Published: (2024)
Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models
by: Jang, Sangwon, et al.
Published: (2025)
Q-HyViT: Post-Training Quantization of Hybrid Vision Transformers with Bridge Block Reconstruction for IoT Systems
by: Lee, Jemin, et al.
Published: (2023)
CorGi: Contribution-Guided Block-Wise Interval Caching for Training-Free Acceleration of Diffusion Transformers
by: Son, Yonglak, et al.
Published: (2025)
Multi-Granular Spatio-Temporal Token Merging for Training-Free Acceleration of Video LLMs
by: Hyun, Jeongseok, et al.
Published: (2025)
Foley Control: Aligning a Frozen Latent Text-to-Audio Model to Video
by: Rowles, Ciara, et al.
Published: (2025)
FlashBlock: Attention Caching for Efficient Long-Context Block Diffusion
by: Chen, Zhuokun, et al.
Published: (2026)
HASTE: Training-Free Video Diffusion Acceleration via Head-Wise Adaptive Sparse Attention
by: Zheng, Xuzhe, et al.
Published: (2026)
Stable Part Diffusion 4D: Multi-View RGB and Kinematic Parts Video Generation
by: Zhang, Hao, et al.
Published: (2025)
Blocks as Probes: Dissecting Categorization Ability of Large Multimodal Models
by: Fu, Bin, et al.
Published: (2024)
Prism: Spectral-Aware Block-Sparse Attention
by: Wang, Xinghao, et al.
Published: (2026)
Unified Dense Prediction of Video Diffusion
by: Yang, Lehan, et al.
Published: (2025)
SketchDeco: Training-Free Latent Composition for Precise Sketch Colourisation
by: Utintu, Chaitat, et al.
Published: (2024)
SV4D 2.0: Enhancing Spatio-Temporal Consistency in Multi-View Video Diffusion for High-Quality 4D Generation
by: Yao, Chun-Han, et al.
Published: (2025)
PhyBlock: A Progressive Benchmark for Physical Understanding and Planning via 3D Block Assembly
by: Ma, Liang, et al.
Published: (2025)
MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation
by: Hu, Hanzhe, et al.
Published: (2024)
ICE-G: Image Conditional Editing of 3D Gaussian Splats
by: Jaganathan, Vishnu, et al.
Published: (2024)
BLADE: Block-Sparse Attention Meets Step Distillation for Efficient Video Generation
by: Gu, Youping, et al.
Published: (2025)
HouseCrafter: Lifting Floorplans to 3D Scenes with 2D Diffusion Model
by: Nguyen, Hieu T., et al.
Published: (2024)
PhysRig: Differentiable Physics-Based Skinning and Rigging Framework for Realistic Articulated Object Modeling
by: Zhang, Hao, et al.
Published: (2025)
FreeVA: Offline MLLM as Training-Free Video Assistant
by: Wu, Wenhao
Published: (2024)
Sparser Block-Sparse Attention via Token Permutation
by: Wang, Xinghao, et al.
Published: (2025)
Stable Video-Driven Portraits
by: R., Mallikarjun B., et al.
Published: (2025)
Where is the Watermark? Interpretable Watermark Detection at the Block Level
by: Bulychev, Maria, et al.
Published: (2025)
Prefix-Adaptive Block Diffusion for Efficient Document Recognition
by: Chai, Mingxu, et al.
Published: (2026)