:: Library Catalog

Bibliographic Details
Main Authors:	Shaulov, Ariel; Shaar, Eitan; Edenzon, Amit; Chechik, Gal; Wolf, Lior
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2605.14988

Similar Items

TokenTrim: Inference-Time Token Pruning for Autoregressive Long Video Generation
by: Shaulov, Ariel, et al.
Published: (2026)

Adapting to the Unknown: Training-Free Audio-Visual Event Perception with Dynamic Thresholds
by: Shaar, Eitan, et al.
Published: (2025)

Latent Transfer Attack: Adversarial Examples via Generative Latent Spaces
by: Shaar, Eitan, et al.
Published: (2026)

FlowMo: Variance-Based Flow Guidance for Coherent Motion in Video Generation
by: Shaulov, Ariel, et al.
Published: (2025)

Classifier-Guided Captioning Across Modalities
by: Shaulov, Ariel, et al.
Published: (2025)

Data-Driven Loss Functions for Inference-Time Optimization in Text-to-Image
by: Yiflach, Sapir Esther, et al.
Published: (2025)

Training-Free Consistent Text-to-Image Generation
by: Tewel, Yoad, et al.
Published: (2024)

Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models
by: Tewel, Yoad, et al.
Published: (2024)

Motion by Queries: Identity-Motion Trade-offs in Text-to-Video Generation
by: Atzmon, Yuval, et al.
Published: (2024)

IlluSign: Illustrating Sign Language Videos by Leveraging the Attention Mechanism
by: Bruner, Janna, et al.
Published: (2025)

ComfyGen: Prompt-Adaptive Workflows for Text-to-Image Generation
by: Gal, Rinon, et al.
Published: (2024)

Single Image Iterative Subject-driven Generation and Editing
by: Shpitzer, Yair, et al.
Published: (2025)

LCM-Lookahead for Encoder-based Text-to-Image Personalization
by: Gal, Rinon, et al.
Published: (2024)

Fast 4D Mesh Generation by Spatio-Temporal Attention Chains
by: Samuel, Dvir, et al.
Published: (2026)

Per-Query Visual Concept Learning
by: Malca, Ori, et al.
Published: (2025)

VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models
by: Chefer, Hila, et al.
Published: (2025)

Assessing Image Quality Using a Simple Generative Representation
by: Raviv, Simon, et al.
Published: (2024)

OmnimatteZero: Fast Training-free Omnimatte with Pre-trained Video Diffusion Models
by: Samuel, Dvir, et al.
Published: (2025)

Still-Moving: Customized Video Generation without Customized Video Data
by: Chefer, Hila, et al.
Published: (2024)

Key-Locked Rank One Editing for Text-to-Image Personalization
by: Tewel, Yoad, et al.
Published: (2023)

TriTex: Learning Texture from a Single Mesh via Triplane Semantic Features
by: Cohen-Bar, Dana, et al.
Published: (2025)

Bringing Objects to Life: training-free 4D generation from 3D objects through view consistent noise
by: Rahamim, Ohad, et al.
Published: (2024)

IT$^3$: Idempotent Test-Time Training
by: Durasov, Nikita, et al.
Published: (2024)

Policy Optimized Text-to-Image Pipeline Design
by: Gadot, Uri, et al.
Published: (2025)

ConsiStyle: Style Diversity in Training-Free Consistent T2I Generation
by: Mazuz, Yohai, et al.
Published: (2025)

Efficient Verification-Based Face Identification
by: Rozner, Amit, et al.
Published: (2023)

Domain-Generalizable Multiple-Domain Clustering
by: Rozner, Amit, et al.
Published: (2023)

SphereUFormer: A U-Shaped Transformer for Spherical 360 Perception
by: Benny, Yaniv, et al.
Published: (2024)

Diffusion-Based Attention Warping for Consistent 3D Scene Editing
by: Gomel, Eyal, et al.
Published: (2024)

MultiAct: Text-to-Motion Generation from Composite Text via Tailored Attention Guidance
by: Sala, Nathan, et al.
Published: (2026)

Obtaining Favorable Layouts for Multiple Object Generation
by: Battash, Barak, et al.
Published: (2024)

Fast Autoregressive Video Diffusion and World Models with Temporal Cache Compression and Sparse Attention
by: Samuel, Dvir, et al.
Published: (2026)

MovieRecapsQA: A Multimodal Open-Ended Video Question-Answering Benchmark
by: Shaar, Shaden, et al.
Published: (2026)

Padding Tone: A Mechanistic Analysis of Padding Tokens in T2I Models
by: Toker, Michael, et al.
Published: (2025)

RL-RC-DoT: A Block-level RL agent for Task-Aware Video Compression
by: Gadot, Uri, et al.
Published: (2025)

Lay-A-Scene: Personalized 3D Object Arrangement Using Text-to-Image Priors
by: Rahamim, Ohad, et al.
Published: (2024)

Make It Count: Text-to-Image Generation with an Accurate Number of Objects
by: Binyamin, Lital, et al.
Published: (2024)

A Meaningful Perturbation Metric for Evaluating Explainability Methods
by: Cohen, Danielle, et al.
Published: (2025)

Linguistic Binding in Diffusion Models: Enhancing Attribute Correspondence through Attention Map Alignment
by: Rassin, Royi, et al.
Published: (2023)

Guardians of Generation: Dynamic Inference-Time Copyright Shielding with Adaptive Guidance for AI Image Generation
by: Roy, Soham, et al.
Published: (2025)