| Main Authors: | Li, Yuming, Jia, Peidong, Hong, Daiwei, Jia, Yueru, She, Qi, Zhao, Rui, Lu, Ming, Zhang, Shanghang |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2412.06163 |
Similar Items
FastInit: Fast Noise Initialization for Temporally Consistent Video Generation
by: Bai, Chengyu, et al.
Published: (2025)
BranchGRPO: Stable and Efficient GRPO with Structured Branching in Diffusion Models
by: Li, Yuming, et al.
Published: (2025)
DesignEdit: Multi-Layered Latent Decomposition and Fusion for Unified & Accurate Image Editing
by: Jia, Yueru, et al.
Published: (2024)
ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation
by: Liu, Jiaming, et al.
Published: (2023)
World-Ego Modeling for Long-Horizon Evolution in Hybrid Embodied Tasks
by: Lin, Zuyao, et al.
Published: (2026)
ChainV: Atomic Visual Hints Make Multimodal Reasoning Shorter and Better
by: Zhang, Yuan, et al.
Published: (2025)
UniEdit-I: Training-free Image Editing for Unified VLM via Iterative Understanding, Editing and Verifying
by: Bai, Chengyu, et al.
Published: (2025)
DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
by: Li, Muyang, et al.
Published: (2024)
Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs
by: Zhang, Qizhe, et al.
Published: (2025)
AutoV: Loss-Oriented Ranking for Visual Prompt Retrieval in LVLMs
by: Zhang, Yuan, et al.
Published: (2025)
ResMaster: Mastering High-Resolution Image Generation via Structural and Fine-Grained Guidance
by: Shi, Shuwei, et al.
Published: (2024)
Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation
by: Jia, Yueru, et al.
Published: (2024)
Beyond Text-Visual Attention: Exploiting Visual Cues for Effective Token Pruning in VLMs
by: Zhang, Qizhe, et al.
Published: (2024)
DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance
by: Kim, Younghyun, et al.
Published: (2024)
FastDriveVLA: Efficient End-to-End Driving via Plug-and-Play Reconstruction-based Token Pruning
by: Cao, Jiajun, et al.
Published: (2025)
COLE: A Hierarchical Generation Framework for Multi-Layered and Editable Graphic Design
by: Jia, Peidong, et al.
Published: (2023)
TimeSearch: Hierarchical Video Search with Spotlight and Reflection for Human-like Long Video Understanding
by: Pan, Junwen, et al.
Published: (2025)
Multi-Scale Diffusion: Enhancing Spatial Layout in High-Resolution Panoramic Image Generation
by: Zhang, Xiaoyu, et al.
Published: (2024)
RustNeRF: Robust Neural Radiance Field with Low-Quality Images
by: Li, Mengfei, et al.
Published: (2024)
Diffusion-based Facial Aesthetics Enhancement with 3D Structure Guidance
by: Li, Lisha, et al.
Published: (2025)
EMD: Explicit Motion Modeling for High-Quality Street Gaussian Splatting
by: Wei, Xiaobao, et al.
Published: (2024)
Generalizing Visual Geometry Priors to Sparse Gaussian Occupancy Prediction
by: Zhou, Changqing, et al.
Published: (2026)
Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance
by: Chan, Kelvin C. K., et al.
Published: (2024)
Structural Energy Guidance for View-Consistent Text-to-3D Generation
by: Zhang, Qing, et al.
Published: (2026)
HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance
by: Bu, Jiazi, et al.
Published: (2025)
From Extrinsic to Intrinsic: Geodesic-Guided Representation Learning for 3D Geometric Data
by: Zhao, Yuming, et al.
Published: (2026)
SRFlow: A Dataset and Regularization Model for High-Resolution Facial Optical Flow via Splatting Rasterization
by: Zhang, JiaLin, et al.
Published: (2026)
Manifold-Optimal Guidance: A Unified Riemannian Control View of Diffusion Guidance
by: Jia, Zexi, et al.
Published: (2026)
One-Shot Crowd Counting With Density Guidance For Scene Adaptation
by: Chen, Jiwei, et al.
Published: (2026)
MC-LLaVA: Multi-Concept Personalized Vision-Language Model
by: An, Ruichuan, et al.
Published: (2025)
Improving Viewpoint Consistency in 3D Generation via Structure Feature and CLIP Guidance
by: Zhang, Qing, et al.
Published: (2024)
Generalizable Engagement Estimation in Conversation via Domain Prompting and Parallel Attention
by: Yu, Yangche, et al.
Published: (2025)
WM-MoE: Weather-aware Multi-scale Mixture-of-Experts for Blind Adverse Weather Removal
by: Luo, Yulin, et al.
Published: (2023)
Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting
by: Li, Zhiqi, et al.
Published: (2024)
Siamese Meets Diffusion Network: SMDNet for Enhanced Change Detection in High-Resolution RS Imagery
by: Jia, Jia, et al.
Published: (2024)
Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention
by: Li, Peng, et al.
Published: (2024)
WristWorld: Generating Wrist-Views via 4D World Models for Robotic Manipulation
by: Qian, Zezhong, et al.
Published: (2025)
ManipDreamer3D : Synthesizing Plausible Robotic Manipulation Video with Occupancy-aware 3D Trajectory
by: Li, Ying, et al.
Published: (2025)
Endo-SemiS: Towards Robust Semi-Supervised Image Segmentation for Endoscopic Video
by: Li, Hao, et al.
Published: (2025)
Dual-Path Learning based on Frequency Structural Decoupling and Regional-Aware Fusion for Low-Light Image Super-Resolution
by: He, Ji-Xuan, et al.
Published: (2026)