| Main Authors: | Zhu, William Yicheng, Ye, Keren, Ke, Junjie, Yu, Jiahui, Guibas, Leonidas, Milanfar, Peyman, Yang, Feng |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2408.04102 |
Similar Items
BlenderAlchemy: Editing 3D Graphics with Vision-Language Models
by: Huang, Ian, et al.
Published: (2024)
Inversion by Direct Iteration: An Alternative to Denoising Diffusion for Image Restoration
by: Delbracio, Mauricio, et al.
Published: (2023)
Denoising: A Powerful Building-Block for Imaging, Inverse Problems, and Machine Learning
by: Milanfar, Peyman, et al.
Published: (2024)
UniRes: Universal Image Restoration for Complex Degradations
by: Zhou, Mo, et al.
Published: (2025)
TextSR: Diffusion Super-Resolution with Multilingual OCR Guidance
by: Ye, Keren, et al.
Published: (2025)
SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities
by: Chen, Boyuan, et al.
Published: (2024)
SPIRE: Semantic Prompt-Driven Image Restoration
by: Qi, Chenyang, et al.
Published: (2023)
VLM-PAR: A Vision Language Model for Pedestrian Attribute Recognition
by: Sellam, Abdellah Zakaria, et al.
Published: (2025)
The Geometry of Noise: Why Diffusion Models Don't Need Noise Conditioning
by: Sahraee-Ardakan, Mojtaba, et al.
Published: (2026)
Reference-Guided Identity Preserving Face Restoration
by: Zhou, Mo, et al.
Published: (2025)
MoMaps: Semantics-Aware Scene Motion Generation with Motion Maps
by: Lei, Jiahui, et al.
Published: (2025)
MoSca: Dynamic Gaussian Fusion from Casual Videos via 4D Motion Scaffolds
by: Lei, Jiahui, et al.
Published: (2024)
SceneTeract: Agentic Functional Affordances and VLM Grounding in 3D Scenes
by: Maillard, Léopold, et al.
Published: (2026)
On the Relation Between Linear Diffusion and Power Iteration
by: Weitzner, Dana, et al.
Published: (2024)
RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation
by: Kuang, Yuxuan, et al.
Published: (2024)
InfoGaussian: Structure-Aware Dynamic Gaussians through Lightweight Information Shaping
by: Zhang, Yunchao, et al.
Published: (2024)
PASTA: Controllable Part-Aware Shape Generation with Autoregressive Transformers
by: Li, Songlin, et al.
Published: (2024)
High Perceptual Quality Image Denoising with a Posterior Sampling CGAN
by: Ohayon, Guy, et al.
Published: (2021)
OCH3R: Object-Centric Holistic 3D Reconstruction
by: Du, Yi, et al.
Published: (2026)
PhysMem: Scaling Test-Time Memory for Embodied Physical Reasoning
by: Li, Haoyang, et al.
Published: (2026)
Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation
by: Lee, Phillip Y., et al.
Published: (2025)
ProvNeRF: Modeling per Point Provenance in NeRFs as a Stochastic Field
by: Nakayama, Kiyohiro, et al.
Published: (2024)
Support-Set Context Matters for Bongard Problems
by: Raghuraman, Nikhil, et al.
Published: (2023)
Dynamic Reflections: Probing Video Representations with Text Alignment
by: Zhu, Tyler, et al.
Published: (2025)
Zero-Shot Image Feature Consensus with Deep Functional Maps
by: Cheng, Xinle, et al.
Published: (2024)
DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models
by: Tian, Xiaoyu, et al.
Published: (2024)
Refining Pre-Trained Motion Models
by: Sun, Xinglong, et al.
Published: (2024)
Stochastic Deep Restoration Priors for Imaging Inverse Problems
by: Hu, Yuyang, et al.
Published: (2024)
Global Motion Corresponder for 3D Point-Based Scene Interpolation under Large Motion
by: Lin, Junru, et al.
Published: (2025)
NeRF Revisited: Fixing Quadrature Instability in Volume Rendering
by: Uy, Mikaela Angelina, et al.
Published: (2023)
Attribute-based Visual Reprogramming for Vision-Language Models
by: Cai, Chengyi, et al.
Published: (2025)
BlenderGym: Benchmarking Foundational Model Systems for Graphics Editing
by: Gu, Yunqi, et al.
Published: (2025)
Multiview Equivariance Improves 3D Correspondence Understanding with Minimal Feature Finetuning
by: You, Yang, et al.
Published: (2024)
Img2CAD: Reverse Engineering 3D CAD Models from Images through VLM-Assisted Conditional Factorization
by: You, Yang, et al.
Published: (2024)
Enhancing Remote Sensing Vision-Language Models Through MLLM and LLM-Based High-Quality Image-Text Dataset Generation
by: He, Yiguo, et al.
Published: (2025)
Who Can See Through You? Adversarial Shielding Against VLM-Based Attribute Inference Attacks
by: Fan, Yucheng, et al.
Published: (2025)
VideoLifter: Lifting Videos to 3D with Fast Hierarchical Stereo Alignment
by: Cong, Wenyan, et al.
Published: (2025)
Bigger is not Always Better: Scaling Properties of Latent Diffusion Models
by: Mei, Kangfu, et al.
Published: (2024)
The Power of Context: How Multimodality Improves Image Super-Resolution
by: Mei, Kangfu, et al.
Published: (2025)
Kernel Density Steering: Inference-Time Scaling via Mode Seeking for Image Restoration
by: Hu, Yuyang, et al.
Published: (2025)