:: Library Catalog

Bibliographic Details
Main Authors:	Peruzzo, Elia; Sautière, Guillaume; Habibian, Amirhossein
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2601.05149

Similar Items

Clockwork Diffusion: Efficient Generation With Model-Step Distillation
by: Habibian, Amirhossein, et al.
Published: (2023)

ReHyAt: Recurrent Hybrid Attention for Video Diffusion Transformers
by: Ghafoorian, Mohsen, et al.
Published: (2026)

Attention Surgery: An Efficient Recipe to Linearize Your Video Diffusion Transformer
by: Ghafoorian, Mohsen, et al.
Published: (2025)

Enhancing Novel View Synthesis via Geometry Grounded Set Diffusion
by: Zanjani, Farhad G., et al.
Published: (2026)

Scene-Aware Location Modeling for Data Augmentation in Automotive Object Detection
by: Petersen, Jens, et al.
Published: (2025)

Grouped Speculative Decoding for Autoregressive Image Generation
by: So, Junhyuk, et al.
Published: (2025)

Continuous Speculative Decoding for Autoregressive Image Generation
by: Wang, Zili, et al.
Published: (2024)

Imagining the Unseen: Generative Location Modeling for Object Placement
by: Yun, Jooyeol, et al.
Published: (2024)

Gated Relational Alignment via Confidence-based Distillation for Efficient VLMs
by: Chen, Yanlong, et al.
Published: (2026)

PyramidalWan: On Making Pretrained Video Model Pyramidal for Efficient Inference
by: Korzhenkov, Denis, et al.
Published: (2026)

Hybrid Gaussian Splatting for Novel Urban View Synthesis
by: Omran, Mohamed, et al.
Published: (2025)

Controllable 3D Placement of Objects with Scene-Aware Diffusion Models
by: Omran, Mohamed, et al.
Published: (2025)

Low-Latency Neural Stereo Streaming
by: Hou, Qiqi, et al.
Published: (2024)

GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models
by: D'Incà, Moreno, et al.
Published: (2024)

RAGME: Retrieval Augmented Video Generation for Enhanced Motion Realism
by: Peruzzo, Elia, et al.
Published: (2025)

Gaussian Splatting is an Effective Data Generator for 3D Object Detection
by: Zanjani, Farhad G., et al.
Published: (2025)

MoAlign: Motion-Centric Representation Alignment for Video Diffusion Models
by: Bhowmik, Aritra, et al.
Published: (2025)

MoViE: Mobile Diffusion for Video Editing
by: Karjauv, Adil, et al.
Published: (2024)

SJD-PV: Speculative Jacobi Decoding with Phrase Verification for Autoregressive Image Generation
by: Yu, Zhehao, et al.
Published: (2026)

SJD-VP: Speculative Jacobi Decoding with Verification Prediction for Autoregressive Image Generation
by: Shan, Bingqi, et al.
Published: (2026)

Mobile Video Diffusion
by: Yahia, Haitam Ben, et al.
Published: (2024)

Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
by: Teng, Yao, et al.
Published: (2024)

Speculative Decoding for Autoregressive Video Generation
by: Hu, Yuezhou, et al.
Published: (2026)

Annealed Relaxation of Speculative Decoding for Faster Autoregressive Image Generation
by: Li, Xingyao, et al.
Published: (2026)

CASCADE: Context-Aware Relaxation for Speculative Image Decoding
by: Yildirim, Selin, et al.
Published: (2026)

Speculative Jacobi-Denoising Decoding for Accelerating Autoregressive Text-to-image Generation
by: Teng, Yao, et al.
Published: (2025)

Object-Centric Diffusion for Efficient Video Editing
by: Kahatapitiya, Kumara, et al.
Published: (2024)

SJD++: Improved Speculative Jacobi Decoding for Training-free Acceleration of Discrete Auto-regressive Text-to-Image Generation
by: Teng, Yao, et al.
Published: (2025)

Speculative Coupled Decoding for Training-Free Lossless Acceleration of Autoregressive Visual Generation
by: So, Junhyuk, et al.
Published: (2025)

OpenBias: Open-set Bias Detection in Text-to-Image Generative Models
by: D'Incà, Moreno, et al.
Published: (2024)

MMSpec: Benchmarking Speculative Decoding for Vision-Language Models
by: Shen, Hui, et al.
Published: (2026)

Neodragon: Mobile Video Generation using Diffusion Transformer
by: Karnewar, Animesh, et al.
Published: (2025)

Multi-Scale Deep Learning for Colon Histopathology: A Hybrid Graph-Transformer Approach
by: Saremi, Sadra, et al.
Published: (2025)

Speculative Decoding Reimagined for Multimodal Large Language Models
by: Lin, Luxi, et al.
Published: (2025)

VASE: Object-Centric Appearance and Shape Manipulation of Real Videos
by: Peruzzo, Elia, et al.
Published: (2024)

Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation
by: Zhang, Zhuoyang, et al.
Published: (2025)

VVS: Accelerating Speculative Decoding for Visual Autoregressive Generation via Partial Verification Skipping
by: Dong, Haotian, et al.
Published: (2025)

Safe Vision-Language Models via Unsafe Weights Manipulation
by: D'Incà, Moreno, et al.
Published: (2025)

XSpecMesh: Quality-Preserving Auto-Regressive Mesh Generation Acceleration via Multi-Head Speculative Decoding
by: Chen, Dian, et al.
Published: (2025)

SpecFLASH: A Latent-Guided Semi-autoregressive Speculative Decoding Framework for Efficient Multimodal Generation
by: Wang, Zihua, et al.
Published: (2025)