:: Library Catalog

Bibliographic Details
Main Authors:	Zakarin, Daniyar; Wandel, Thiemo; Obukhov, Anton; Dai, Dengxin
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2512.05000

Similar Items

Complete Gaussian Splats from a Single Image with Denoising Diffusion Models
by: Liao, Ziwei, et al.
Published: (2025)

Efficient Adaptation of Pre-trained Vision Transformer via Householder Transformation
by: Dong, Wei, et al.
Published: (2024)

HadaNorm: Diffusion Transformer Quantization through Mean-Centered Transformations
by: Federici, Marco, et al.
Published: (2025)

Erase Diffusion: Empowering Object Removal Through Calibrating Diffusion Pathways
by: Liu, Yi, et al.
Published: (2025)

Video Depth Propagation
by: Piccinelli, Luigi, et al.
Published: (2025)

OminiControl2: Efficient Conditioning for Diffusion Transformers
by: Tan, Zhenxiong, et al.
Published: (2025)

DDiT: Dynamic Patch Scheduling for Efficient Diffusion Transformers
by: Kim, Dahye, et al.
Published: (2026)

VMonarch: Efficient Video Diffusion Transformers with Structured Attention
by: Liang, Cheng, et al.
Published: (2026)

GalaxyDiT: Efficient Video Generation with Guidance Alignment and Adaptive Proxy in Diffusion Transformers
by: Song, Zhiye, et al.
Published: (2025)

Latent Feature-Guided Diffusion Models for Shadow Removal
by: Mei, Kangfu, et al.
Published: (2023)

Transcending Domains through Text-to-Image Diffusion: A Source-Free Approach to Domain Adaptation
by: Chopra, Shivang, et al.
Published: (2023)

DocShaDiffusion: Diffusion Model in Latent Space for Document Image Shadow Removal
by: Liu, Wenjie, et al.
Published: (2025)

Faster Inference of Integer SWIN Transformer by Removing the GELU Activation
by: Tayaranian, Mohammadreza, et al.
Published: (2024)

BWCache: Accelerating Video Diffusion Transformers through Block-Wise Caching
by: Cui, Hanshuai, et al.
Published: (2025)

Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy
by: Yang, Yiting, et al.
Published: (2025)

JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers
by: Byung-Ki, Kwon, et al.
Published: (2025)

SIRR-LMM: Single-image Reflection Removal via Large Multimodal Model
by: Guo, Yu, et al.
Published: (2026)

SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer
by: Chen, Junsong, et al.
Published: (2025)

HQ-DiT: Efficient Diffusion Transformer with FP4 Hybrid Quantization
by: Liu, Wenxuan, et al.
Published: (2024)

EDT: An Efficient Diffusion Transformer Framework Inspired by Human-like Sketching
by: Chen, Xinwang, et al.
Published: (2024)

Seeing through Unclear Glass: Occlusion Removal with One Shot
by: Li, Qiang, et al.
Published: (2025)

Temporal Aware Pruning for Efficient Diffusion-based Video Generation
by: Li, Sheng, et al.
Published: (2026)

On-Device Diffusion Transformer Policy for Efficient Robot Manipulation
by: Wu, Yiming, et al.
Published: (2025)

EdgeDiT: Hardware-Aware Diffusion Transformers for Efficient On-Device Image Generation
by: Kodavanti, Sravanth, et al.
Published: (2026)

Single Image Reflection Separation via Dual Prior Interaction Transformer
by: Huang, Yue, et al.
Published: (2025)

V-Reflection: Transforming MLLMs from Passive Observers to Active Interrogators
by: Zhou, Jiazhou, et al.
Published: (2026)

Memory-Efficient Fine-Tuning Diffusion Transformers via Dynamic Patch Sampling and Block Skipping
by: Park, Sunghyun, et al.
Published: (2026)

TransNeXt: Robust Foveal Visual Perception for Vision Transformers
by: Shi, Dai
Published: (2023)

Aligning Diffusion Models with Noise-Conditioned Perception
by: Gambashidze, Alexander, et al.
Published: (2024)

DDT: Decoupled Diffusion Transformer
by: Wang, Shuai, et al.
Published: (2025)

Feature Fusion Transferability Aware Transformer for Unsupervised Domain Adaptation
by: Yu, Xiaowei, et al.
Published: (2024)

SceneAdapt: Scene-aware Adaptation of Human Motion Diffusion
by: Cho, Jungbin, et al.
Published: (2025)

EUDA: An Efficient Unsupervised Domain Adaptation via Self-Supervised Vision Transformer
by: Abedi, Ali, et al.
Published: (2024)

DiTFastAttnV2: Head-wise Attention Compression for Multi-Modality Diffusion Transformers
by: Zhang, Hanling, et al.
Published: (2025)

Scaling Diffusion Transformers Efficiently via μP
by: Zheng, Chenyu, et al.
Published: (2025)

Near-Infrared and Low-Rank Adaptation of Vision Transformers in Remote Sensing
by: Ulku, Irem, et al.
Published: (2024)

Region-Adaptive Sampling for Diffusion Transformers
by: Liu, Ziming, et al.
Published: (2025)

SDiT: Spiking Diffusion Model with Transformer
by: Yang, Shu, et al.
Published: (2024)

StealthDiffusion: Towards Evading Diffusion Forensic Detection through Diffusion Model
by: Zhou, Ziyin, et al.
Published: (2024)

SPARE: Self-distillation for PARameter-Efficient Removal
by: Mola, Natnael, et al.
Published: (2026)