Library Catalog

Bibliographic Details
Main Authors:	Jain, Anubhav; Kobayashi, Yuya; Shibuya, Takashi; Takida, Yuhta; Memon, Nasir; Togelius, Julian; Mitsufuji, Yuki
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2411.16738

Similar Items

TraSCE: Trajectory Steering for Concept Erasure
by: Jain, Anubhav, et al.
Published: (2024)

Forging and Removing Latent-Noise Diffusion Watermarks Using a Single Image
by: Jain, Anubhav, et al.
Published: (2025)

Efficiency without Compromise: CLIP-aided Text-to-Image GANs with Increased Diversity
by: Kobayashi, Yuya, et al.
Published: (2025)

MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training
by: Uchida, Kengo, et al.
Published: (2024)

BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network
by: Shibuya, Takashi, et al.
Published: (2023)

Alpha-wolves and Alpha-mammals: Exploring Dictionary Attacks on Iris Recognition Systems
by: Banerjee, Sudipta, et al.
Published: (2023)

Denoising Multi-Beta VAE: Representation Learning for Disentanglement and Generation
by: Uppal, Anshuk, et al.
Published: (2025)

VCT: Training Consistency Models with Variational Noise Coupling
by: Silvestri, Gianluigi, et al.
Published: (2025)

SONA: Learning Conditional, Unconditional, and Mismatching-Aware Discriminator
by: Takida, Yuhta, et al.
Published: (2025)

Step-by-Step Video-to-Audio Synthesis via Negative Audio Guidance
by: Hayakawa, Akio, et al.
Published: (2025)

PAVAS: Physics-Aware Video-to-Audio Synthesis
by: Hyun-Bin, Oh, et al.
Published: (2025)

FaceCloak: Learning to Protect Face Templates
by: Banerjee, Sudipta, et al.
Published: (2025)

Jump Your Steps: Optimizing Sampling Schedule of Discrete Diffusion Models
by: Park, Yong-Hyun, et al.
Published: (2024)

Dyadic Mamba: Long-term Dyadic Human Motion Synthesis
by: Tanke, Julian, et al.
Published: (2025)

Improving Classifier-Free Guidance in Masked Diffusion: Low-Dim Theoretical Insights with High-Dim Impact
by: Rojas, Kevin, et al.
Published: (2025)

Improved Object-Centric Diffusion Learning with Registers and Contrastive Alignment
by: Nguyen, Bac, et al.
Published: (2026)

HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes
by: Takida, Yuhta, et al.
Published: (2023)

G2D2: Gradient-Guided Discrete Diffusion for Inverse Problem Solving
by: Murata, Naoki, et al.
Published: (2024)

MMDisCo: Multi-Modal Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation
by: Hayakawa, Akio, et al.
Published: (2024)

A Unified View of Score-Based and Drifting Models
by: Lai, Chieh-Hsin, et al.
Published: (2026)

PaGoDA: Progressive Growing of a One-Step Generator from a Low-Resolution Diffusion Teacher
by: Kim, Dongjun, et al.
Published: (2024)

HumanGif: Single-View Human Diffusion with Generative Prior
by: Hu, Shoukang, et al.
Published: (2025)

SAN: Inducing Metrizability of GAN with Discriminative Normalized Linear Layer
by: Takida, Yuhta, et al.
Published: (2023)

SoundCTM: Unifying Score-based and Consistency Models for Full-band Text-to-Sound Generation
by: Saito, Koichi, et al.
Published: (2024)

Noise Scheduling as Information-Guided Allocation in Diffusion Training
by: Raya, Gabriel, et al.
Published: (2026)

Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion
by: Kim, Dongjun, et al.
Published: (2023)

CCStereo: Audio-Visual Contextual and Contrastive Learning for Binaural Audio Generation
by: Chen, Yuanhong, et al.
Published: (2025)

TITAN-Guide: Taming Inference-Time AligNment for Guided Text-to-Video Diffusion Models
by: Simon, Christian, et al.
Published: (2025)

AutoRefiner: Improving Autoregressive Video Diffusion Models via Reflective Refinement Over the Stochastic Sampling Path
by: Yu, Zhengyang, et al.
Published: (2025)

Concept-TRAK: Understanding how diffusion models learn concepts through concept-level attribution
by: Park, Yonghyun, et al.
Published: (2025)

MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
by: Cheng, Ho Kei, et al.
Published: (2024)

StereoSync: Spatially-Aware Stereo Audio Generation from Video
by: Marinoni, Christian, et al.
Published: (2025)

Transformed Low-rank Adaptation via Tensor Decomposition and Its Applications to Text-to-image Models
by: Tao, Zerui, et al.
Published: (2025)

Demystifying MaskGIT Sampler and Beyond: Adaptive Order Selection in Masked Diffusion
by: Hayakawa, Satoshi, et al.
Published: (2025)

Distillation of Discrete Diffusion through Dimensional Correlations
by: Hayakawa, Satoshi, et al.
Published: (2024)

Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation
by: Yang, Shiqi, et al.
Published: (2024)

Factored Classifier-Free Guidance
by: Xia, Tian, et al.
Published: (2025)

GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping
by: Seo, Junyoung, et al.
Published: (2024)

Distill, Forget, Repeat: A Framework for Continual Unlearning in Text-to-Image Diffusion Models
by: George, Naveen, et al.
Published: (2025)

Towards Understanding the Mechanisms of Classifier-Free Guidance
by: Li, Xiang, et al.
Published: (2025)