| Main Authors: | Wang, Chongyu, Huang, Ting, Sun, Chunyu, Ning, Xinyu, Wang, Di, Tang, Hao |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.05695 |
Similar Items
PromptTea: Let Prompts Tell TeaCache the Optimal Threshold
by: Huang, Zishen, et al.
Published: (2025)
Efficiently Expanding Receptive Fields: Local Split Attention and Parallel Aggregation for Enhanced Large-scale Point Cloud Semantic Segmentation
by: Wang, Haodong, et al.
Published: (2024)
OpenUrban3D: Annotation-Free Open-Vocabulary Semantic Segmentation of Large-Scale Urban Point Clouds
by: Wang, Chongyu, et al.
Published: (2025)
Context Unrolling in Omni Models
by: Yang, Ceyuan, et al.
Published: (2026)
DSPFusion: Image Fusion via Degradation and Semantic Dual-Prior Guidance
by: Tang, Linfeng, et al.
Published: (2025)
Slow Perception: Let's Perceive Geometric Figures Step-by-step
by: Wei, Haoran, et al.
Published: (2024)
Straightforward Layer-wise Pruning for More Efficient Visual Adaptation
by: Han, Ruizi, et al.
Published: (2024)
Geometric Prior Based Deep Human Point Cloud Geometry Compression
by: Wu, Xinju, et al.
Published: (2023)
Multimodal Industrial Anomaly Detection via Geometric Prior
by: Li, Min, et al.
Published: (2026)
Action-Geometry Prediction with 3D Geometric Prior for Bimanual Manipulation
by: Xu, Chongyang, et al.
Published: (2026)
Efficient Portrait Matte Creation With Layer Diffusion and Connectivity Priors
by: Lu, Zhiyuan, et al.
Published: (2025)
Advancing the Understanding of Fine-Grained 3D Forest Structures using Digital Cousins and Simulation-to-Reality: Methods and Datasets
by: Liu, Jing, et al.
Published: (2025)
TELA: Text to Layer-wise 3D Clothed Human Generation
by: Dong, Junting, et al.
Published: (2024)
Let Language Constrain Geometry: Vision-Language Models as Semantic and Spatial Critics for 3D Generation
by: Bai, Weimin, et al.
Published: (2025)
FaithFusion: Harmonizing Reconstruction and Generation via Pixel-wise Information Gain
by: Wang, YuAn, et al.
Published: (2025)
Segmentation-guided Layer-wise Image Vectorization with Gradient Fills
by: Zhou, Hengyu, et al.
Published: (2024)
LARV: Data-Free Layer-wise Adaptive Rescaling Veneer for Model Merging
by: Wang, Xinyu, et al.
Published: (2026)
PGAHum: Prior-Guided Geometry and Appearance Learning for High-Fidelity Animatable Human Reconstruction
by: Wang, Hao, et al.
Published: (2024)
Enhancing Image Aesthetics with Dual-Conditioned Diffusion Models Guided by Multimodal Perception
by: Nan, Xinyu, et al.
Published: (2026)
Learning to Synergize Semantic and Geometric Priors for Limited-Data Wheat Disease Segmentation
by: Wang, Shijie, et al.
Published: (2026)
Let Storytelling Tell Vivid Stories: An Expressive and Fluent Multimodal Storyteller
by: Zang, Chuanqi, et al.
Published: (2024)
SwimVG: Step-wise Multimodal Fusion and Adaption for Visual Grounding
by: Shi, Liangtao, et al.
Published: (2025)
Vision Function Layer in Multimodal LLMs
by: Shi, Cheng, et al.
Published: (2025)
LLMs as Bridges: Reformulating Grounded Multimodal Named Entity Recognition
by: Li, Jinyuan, et al.
Published: (2024)
LaCo: Efficient Layer-wise Compression of Visual Tokens for Multimodal Large Language Models
by: Liu, Juntao, et al.
Published: (2025)
ELSA: Exploiting Layer-wise N:M Sparsity for Vision Transformer Acceleration
by: Huang, Ning-Chi, et al.
Published: (2024)
Layer-wise Instance Binding for Regional and Occlusion Control in Text-to-Image Diffusion Transformers
by: Chen, Ruidong, et al.
Published: (2026)
AHAP: Reconstructing Arbitrary Humans from Arbitrary Perspectives with Geometric Priors
by: Qiao, Xiaozhen, et al.
Published: (2026)
Sparse Gain Radio Map Reconstruction With Geometry Priors and Uncertainty-Guided Measurement Selection
by: Zeng, Zhihan, et al.
Published: (2026)
Dragging with Geometry: From Pixels to Geometry-Guided Image Editing
by: Pu, Xinyu, et al.
Published: (2025)
RePer-360: Releasing Perspective Priors for 360$^\circ$ Depth Estimation via Self-Modulation
by: Guan, Cheng, et al.
Published: (2026)
Unlocking the Potential of Difficulty Prior in RL-based Multimodal Reasoning
by: Chen, Mingrui, et al.
Published: (2025)
SpatialGeo: Boosting Spatial Reasoning in Multimodal LLMs via Geometry-Semantics Fusion
by: Guo, Jiajie, et al.
Published: (2025)
LISA: A Layer-wise Integration and Suppression Approach for Hallucination Mitigation in Multimodal Large Language Models
by: Guo, Zhihui, et al.
Published: (2025)
GUIDE: A Guideline-Guided Dataset for Instructional Video Comprehension
by: Liang, Jiafeng, et al.
Published: (2024)
VidSplat: Gaussian Splatting Reconstruction with Geometry-Guided Video Diffusion Priors
by: Tang, Jimin, et al.
Published: (2026)
Advancing Structured Priors for Sparse-Voxel Surface Reconstruction
by: Chi, Ting-Hsun, et al.
Published: (2026)
Unrolled Decomposed Unpaired Learning for Controllable Low-Light Video Enhancement
by: Zhu, Lingyu, et al.
Published: (2024)
Unrolled Reconstruction with Integrated Super-Resolution for Accelerated 3D LGE MRI
by: Hisham, Md Hasibul Husain, et al.
Published: (2026)
3D CoCa: Contrastive Learners are 3D Captioners
by: Huang, Ting, et al.
Published: (2025)