| Main Authors: | Adnan, Muhammad; Kurella, Nithesh; Arunkumar, Akhil; Nair, Prashant J. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2506.00329 |
Similar Items
Hybrid Quantum-Classical Model for Image Classification
by: Shahzad, Muhammad Adnan
Published: (2025)
SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation
by: Yoon, Jaehong, et al.
Published: (2024)
DAUNet: A Lightweight UNet Variant with Deformable Convolutions and Parameter-Free Attention for Medical Image Segmentation
by: Munir, Adnan, et al.
Published: (2025)
Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation
by: Seo, Hoigi, et al.
Published: (2025)
Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC
by: Du, Yilun, et al.
Published: (2023)
Beyond FVD: Enhanced Evaluation Metrics for Video Generation Quality
by: Luo, Ge Ya, et al.
Published: (2024)
Contextualized Diffusion Models for Text-Guided Image and Video Generation
by: Yang, Ling, et al.
Published: (2024)
LayerT2V: A Unified Multi-Layer Video Generation Framework
by: Li, Guangzhao, et al.
Published: (2025)
TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation
by: Bansal, Hritik, et al.
Published: (2024)
Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback
by: Furuta, Hiroki, et al.
Published: (2024)
M4V: Multi-Modal Mamba for Text-to-Video Generation
by: Huang, Jiancheng, et al.
Published: (2025)
Enhancing Diffusion Models for High-Quality Image Generation
by: Shah, Jaineet, et al.
Published: (2024)
Not All Layers Are Created Equal: Adaptive LoRA Ranks for Personalized Image Generation
by: Shenaj, Donald, et al.
Published: (2026)
TextCraftor: Your Text Encoder Can be Image Quality Controller
by: Li, Yanyu, et al.
Published: (2024)
LayeredDoc: Domain Adaptive Document Restoration with a Layer Separation Approach
by: Pilligua, Maria, et al.
Published: (2024)
STIV: Scalable Text and Image Conditioned Video Generation
by: Lin, Zongyu, et al.
Published: (2024)
SADA: Stability-guided Adaptive Diffusion Acceleration
by: Jiang, Ting, et al.
Published: (2025)
Accelerating Vision Transformers with Adaptive Patch Sizes
by: Choudhury, Rohan, et al.
Published: (2025)
GalaxyDiT: Efficient Video Generation with Guidance Alignment and Adaptive Proxy in Diffusion Transformers
by: Song, Zhiye, et al.
Published: (2025)
GRADEO: Towards Human-Like Evaluation for Text-to-Video Generation via Multi-Step Reasoning
by: Mou, Zhun, et al.
Published: (2025)
MVSA-Net: Multi-View State-Action Recognition for Robust and Deployable Trajectory Generation
by: Asali, Ehsan, et al.
Published: (2023)
Identifying Bias in Deep Neural Networks Using Image Transforms
by: Erukude, Sai Teja, et al.
Published: (2024)
Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning
by: Girdhar, Rohit, et al.
Published: (2023)
Keyformer: KV Cache Reduction through Key Tokens Selection for Efficient Generative Inference
by: Adnan, Muhammad, et al.
Published: (2024)
Accelerating Video Inverse Problem Solvers with Autoregressive Diffusion Models
by: Kwon, Taesung, et al.
Published: (2026)
Adaptive Keyframe Sampling for Long Video Understanding
by: Tang, Xi, et al.
Published: (2025)
From Text to Pose to Image: Improving Diffusion Model Control and Quality
by: Bonnet, Clément, et al.
Published: (2024)
HiFA: High-fidelity Text-to-3D Generation with Advanced Diffusion Guidance
by: Zhu, Junzhe, et al.
Published: (2023)
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times
by: Zhang, Jintao, et al.
Published: (2025)
MARVEL-40M+: Multi-Level Visual Elaboration for High-Fidelity Text-to-3D Content Creation
by: Sinha, Sankalp, et al.
Published: (2024)
Accelerating Large-Scale Dataset Distillation via Exploration-Exploitation Optimization
by: Alahmadi, Muhammad J., et al.
Published: (2026)
IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation
by: Melas-Kyriazi, Luke, et al.
Published: (2024)
Denoising Score Distillation: From Noisy Diffusion Pretraining to One-Step High-Quality Generation
by: Chen, Tianyu, et al.
Published: (2025)
Hierarchical Active Inference using Successor Representations
by: Rangarajan, Prashant, et al.
Published: (2026)
TempoControl: Temporal Attention Guidance for Text-to-Video Models
by: Schiber, Shira, et al.
Published: (2025)
Layer- and Timestep-Adaptive Differentiable Token Compression Ratios for Efficient Diffusion Transformers
by: You, Haoran, et al.
Published: (2024)
LARV: Data-Free Layer-wise Adaptive Rescaling Veneer for Model Merging
by: Wang, Xinyu, et al.
Published: (2026)
Budgeted Online Continual Learning by Adaptive Layer Freezing and Frequency-based Sampling
by: Seo, Minhyuk, et al.
Published: (2024)
Ground-A-Video: Zero-shot Grounded Video Editing using Text-to-image Diffusion Models
by: Jeong, Hyeonho, et al.
Published: (2023)
EUGens: Efficient, Unified, and General Dense Layers
by: Kim, Sang Min, et al.
Published: (2024)