Library Catalog

Bibliographic Details
Main Authors:	Lou, Xinghua; Dave, Meet; Kushagra, Shrinu; Lazaro-Gredilla, Miguel; Murphy, Kevin
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2406.19635

Similar Items

Fast Samplers for Inverse Problems in Iterative Refinement Models
by: Pandey, Kushagra, et al.
Published: (2024)

Neural USD: An object-centric framework for iterative editing and control
by: Escontrela, Alejandro, et al.
Published: (2025)

Symbolic Graphics Programming with Large Language Models
by: Chen, Yamei, et al.
Published: (2025)

Variational Control for Guidance in Diffusion Models
by: Pandey, Kushagra, et al.
Published: (2025)

SVGen: Interpretable Vector Graphics Generation with Large Language Models
by: Wang, Feiyu, et al.
Published: (2025)

Improving Transformer World Models for Data-Efficient RL
by: Dedieu, Antoine, et al.
Published: (2025)

Incremental Multi-Scene Modeling via Continual Neural Graphics Primitives
by: Singh, Prajwal, et al.
Published: (2024)

Towards a Mechanistic Explanation of Diffusion Model Generalization
by: Niedoba, Matthew, et al.
Published: (2024)

Programmatic Video Prediction Using Large Language Models
by: Tang, Hao, et al.
Published: (2025)

SVGFusion: A VAE-Diffusion Transformer for Vector Graphic Generation
by: Xing, Ximing, et al.
Published: (2024)

Fully Kolmogorov-Arnold Deep Model in Medical Image Segmentation
by: Qiu, Xingyu, et al.
Published: (2026)

Video Prediction of Dynamic Physical Simulations With Pixel-Space Spatiotemporal Transformers
by: Slack, Dean L, et al.
Published: (2025)

Transformer Model for Alzheimer's Disease Progression Prediction Using Longitudinal Visit Sequences
by: Moghaddami, Mahdi, et al.
Published: (2025)

Elucidating the Design Space of Arbitrary-Noise-Based Diffusion Models
by: Qiu, Xingyu, et al.
Published: (2025)

Multimodal Structure Learning: Disentangling Shared and Specific Topology via Cross-Modal Graphical Lasso
by: Wang, Fei, et al.
Published: (2026)

LAPA: Log-Domain Prediction-Driven Dynamic Sparsity Accelerator for Transformer Model
by: Wang, Huizheng, et al.
Published: (2025)

An Explainable Transformer Model for Alzheimer's Disease Detection Using Retinal Imaging
by: Jamshidiha, Saeed, et al.
Published: (2025)

Enhancing Robustness of Human Detection Algorithms in Maritime SAR through Augmented Aerial Images to Simulate Weather Conditions
by: Tjia, Miguel, et al.
Published: (2024)

Structured Generations: Using Hierarchical Clusters to guide Diffusion Models
by: Goncalves, Jorge da Silva, et al.
Published: (2024)

Geometric Trajectory Diffusion Models
by: Han, Jiaqi, et al.
Published: (2024)

Decoding Defensive Coverage Responsibilities in American Football Using Factorized Attention Based Transformer Models
by: Song, Kevin, et al.
Published: (2026)

Early Prediction of Type 2 Diabetes Using Multimodal data and Tabular Transformers
by: Khan, Sulaiman, et al.
Published: (2026)

Assessing Graphical Perception of Image Embedding Models using Channel Effectiveness
by: Lee, Soohyun, et al.
Published: (2024)

Direct Motion Models for Assessing Generated Videos
by: Allen, Kelsey, et al.
Published: (2025)

VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation
by: Zou, Bocheng, et al.
Published: (2024)

Fast and Efficient Transformer-based Method for Bird's Eye View Instance Prediction
by: Antunes-García, Miguel, et al.
Published: (2024)

Enhancing Adversarial Robustness via Uncertainty-Aware Distributional Adversarial Training
by: Dong, Junhao, et al.
Published: (2024)

Directly Fine-Tuning Diffusion Models on Differentiable Rewards
by: Clark, Kevin, et al.
Published: (2023)

Can Large Language Models Understand Symbolic Graphics Programs?
by: Qiu, Zeju, et al.
Published: (2024)

Control-Augmented Autoregressive Diffusion for Data Assimilation
by: Srivastava, Prakhar, et al.
Published: (2025)

Token Caching for Diffusion Transformer Acceleration
by: Lou, Jinming, et al.
Published: (2024)

A Transformer-based Multimodal Fusion Model for Efficient Crowd Counting Using Visual and Wireless Signals
by: Cui, Zhe, et al.
Published: (2025)

Context-Aware Zero-Shot Anomaly Detection in Surveillance Using Contrastive and Predictive Spatiotemporal Modeling
by: Khan, Md. Rashid Shahriar, et al.
Published: (2025)

Learning Transformer-based World Models with Contrastive Predictive Coding
by: Burchi, Maxime, et al.
Published: (2025)

Hierarchical Variational Policies for Reward-Guided Diffusion
by: Pandey, Kushagra, et al.
Published: (2026)

Structure Preserving Diffusion Models
by: Lu, Haoye, et al.
Published: (2024)

Slot Structured World Models
by: Collu, Jonathan, et al.
Published: (2024)

A Transformer-Based Model for the Prediction of Human Gaze Behavior on Videos
by: Ozdel, Suleyman, et al.
Published: (2024)

TerDiT: Ternary Diffusion Models with Transformers
by: Lu, Xudong, et al.
Published: (2024)

Graphic-Design-Bench: A Comprehensive Benchmark for Evaluating AI on Graphic Design Tasks
by: Deganutti, Adrienne, et al.
Published: (2026)