:: Library Catalog

Bibliographic Details
Main Authors:	Liu, Kai; Wang, Ruohui; Gao, Jianfei; Chen, Kai
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2405.07194

Similar Items

FODA-PG for Enhanced Medical Imaging Narrative Generation: Adaptive Differentiation of Normal and Abnormal Attributes
by: Shu, Kai, et al.
Published: (2024)

Linear Differential Vision Transformer: Learning Visual Contrasts via Pairwise Differentials
by: Pu, Yifan, et al.
Published: (2025)

Joint PET-MRI Reconstruction with Diffusion Stochastic Differential Model
by: Xie, Taofeng, et al.
Published: (2024)

DreamCAD: Scaling Multi-modal CAD Generation using Differentiable Parametric Surfaces
by: Khan, Mohammad Sadil, et al.
Published: (2026)

RIG: Synergizing Reasoning and Imagination in End-to-End Generalist Policy
by: Zhao, Zhonghan, et al.
Published: (2025)

An Ordinary Differential Equation Sampler with Stochastic Start for Diffusion Bridge Models
by: Wang, Yuang, et al.
Published: (2024)

Differentiable Prompt Learning for Vision Language Models
by: Huang, Zhenhan, et al.
Published: (2024)

Visual Generation Without Guidance
by: Chen, Huayu, et al.
Published: (2025)

Privacy-Preserving Model Transcription with Differentially Private Synthetic Distillation
by: Liu, Bochao, et al.
Published: (2026)

Modeling Retinal Ganglion Cells with Neural Differential Equations
by: Dobek, Kacper, et al.
Published: (2025)

TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times
by: Zhang, Jintao, et al.
Published: (2025)

ELV-Halluc: Benchmarking Semantic Aggregation Hallucinations in Long Video Understanding
by: Lu, Hao, et al.
Published: (2025)

Differentially Private Fine-Tuning of Diffusion Models
by: Tsai, Yu-Lin, et al.
Published: (2024)

A Multimodal Benchmark Dataset and Model for Crop Disease Diagnosis
by: Liu, Xiang, et al.
Published: (2025)

ScaleNet: Scaling up Pretrained Neural Networks with Incremental Parameters
by: Hao, Zhiwei, et al.
Published: (2025)

A Forward and Backward Compatible Framework for Few-shot Class-incremental Pill Recognition
by: Zhang, Jinghua, et al.
Published: (2023)

Gradient-Guided Conditional Diffusion Models for Private Image Reconstruction: Analyzing Adversarial Impacts of Differential Privacy and Denoising
by: Huang, Tao, et al.
Published: (2024)

LayoutVLM: Differentiable Optimization of 3D Layout via Vision-Language Models
by: Sun, Fan-Yun, et al.
Published: (2024)

CODE: Confident Ordinary Differential Editing
by: van Delft, Bastien, et al.
Published: (2024)

DiG: Differential Grounding for Enhancing Fine-Grained Perception in Multimodal Large Language Model
by: Tao, Zhou, et al.
Published: (2025)

A Large Vision-Language Model based Environment Perception System for Visually Impaired People
by: Chen, Zezhou, et al.
Published: (2025)

Flexible Physical Camouflage Generation Based on a Differential Approach
by: Li, Yang, et al.
Published: (2024)

Evaluating Adversarial Protections for Diffusion Personalization: A Comprehensive Study
by: Ye, Kai, et al.
Published: (2025)

VAGS: Velocity Adaptive Guidance Scale for Image Editing and Generation
by: Luo, Yan, et al.
Published: (2026)

A Survey on Vision Autoregressive Model
by: Jiang, Kai, et al.
Published: (2024)

On the Promise for Assurance of Differentiable Neurosymbolic Reasoning Paradigms
by: Richards, Luke E., et al.
Published: (2025)

Inference-Time Scaling of Diffusion Models for Infrared Data Generation
by: Horstmann, Kai A., et al.
Published: (2025)

Navigating Efficiency in MobileViT through Gaussian Process on Global Architecture Factors
by: Meng, Ke, et al.
Published: (2024)

Large-Scale Universal Defect Generation: Foundation Models and Datasets
by: Fan, Yuanting, et al.
Published: (2026)

Scale-Aware Curriculum Learning for Data-Efficient Lung Nodule Detection with YOLOv11
by: Luo, Yi, et al.
Published: (2025)

MITS: A Large-Scale Multimodal Benchmark Dataset for Intelligent Traffic Surveillance
by: Zhao, Kaikai, et al.
Published: (2025)

AnyTrans: Translate AnyText in the Image with Large Scale Models
by: Qian, Zhipeng, et al.
Published: (2024)

Exclusive Style Removal for Cross Domain Novel Class Discovery
by: Wang, Yicheng, et al.
Published: (2024)

TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward
by: Luo, Yihong, et al.
Published: (2026)

Adaptive Channel Allocation for Robust Differentiable Architecture Search
by: Li, Chao, et al.
Published: (2022)

Physical Prompt Injection Attacks on Large Vision-Language Models
by: Ling, Chen, et al.
Published: (2026)

DMTG: One-Shot Differentiable Multi-Task Grouping
by: Gao, Yuan, et al.
Published: (2024)

LeMiCa: Lexicographic Minimax Path Caching for Efficient Diffusion-Based Video Generation
by: Gao, Huanlin, et al.
Published: (2025)

FlashBlock: Attention Caching for Efficient Long-Context Block Diffusion
by: Chen, Zhuokun, et al.
Published: (2026)

BiomedAP: A Vision-Informed Dual-Anchor Framework with Gated Cross-Modal Fusion for Robust Medical Vision-Language Adaptation
by: Tong, Huanyang, et al.
Published: (2026)