Library Catalog

Bibliographic Details
Main Authors:	Wei, Riling, Chen, Hanjie, Yao, Kelu, Yang, Chuanguang, Wang, Jun, Li, Chao
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2501.01983

Similar Items

Asymmetric Cross-Modal Knowledge Distillation: Bridging Modalities with Weak Semantic Consistency
by: Wei, Riling, et al.
Published: (2025)

AHPA: Adaptive Hierarchical Prior Alignment for Diffusion Transformers
by: Min, Ruibin, et al.
Published: (2026)

Boosting Medical Image Synthesis via Registration-guided Consistency and Disentanglement Learning
by: Li, Chuanpu, et al.
Published: (2024)

LAYOUTDREAMER: Physics-guided Layout for Text-to-3D Compositional Scene Generation
by: Zhou, Yang, et al.
Published: (2025)

Fine-tuning an ECG Foundation Model to Predict Coronary CT Angiography Outcomes
by: Xiao, Yujie, et al.
Published: (2025)

Comparing Deep Neural Network for Multi-Label ECG Diagnosis From Scanned ECG
by: Nguyen, Cuong V., et al.
Published: (2025)

FOLK: Fast Open-Vocabulary 3D Instance Segmentation via Label-guided Knowledge Distillation
by: Wu, Hongrui, et al.
Published: (2025)

DeepSport: A Multimodal Large Language Model for Comprehensive Sports Video Reasoning via Agentic Reinforcement Learning
by: Zou, Junbo, et al.
Published: (2025)

Mutual Information guided Visual Contrastive Learning
by: Chen, Hanyang, et al.
Published: (2025)

BenchCAD: A Comprehensive, Industry-Standard Benchmark for Programmatic CAD
by: Zhang, Haozhe, et al.
Published: (2026)

AnyMS: Bottom-up Attention Decoupling for Layout-guided and Training-free Multi-subject Customization
by: Yu, Binhe, et al.
Published: (2025)

Find Them All: Unveiling MLLMs for Versatile Person Re-identification
by: Li, Jinhao, et al.
Published: (2025)

Frequency-Aligned Knowledge Distillation for Lightweight Spatiotemporal Forecasting
by: Li, Yuqi, et al.
Published: (2025)

A Physical Coherence Benchmark for Evaluating Video Generation Models via Optical Flow-guided Frame Prediction
by: Chen, Yongfan, et al.
Published: (2025)

Depth-guided Texture Diffusion for Image Semantic Segmentation
by: Sun, Wei, et al.
Published: (2024)

SAM-COD: SAM-guided Unified Framework for Weakly-Supervised Camouflaged Object Detection
by: Chen, Huafeng, et al.
Published: (2024)

XCoOp: Explainable Prompt Learning for Computer-Aided Diagnosis via Concept-guided Context Optimization
by: Bie, Yequan, et al.
Published: (2024)

Following the Diagnostic Trace: Visual Cognition-guided Cooperative Network for Chest X-Ray Diagnosis
by: Wu, Shaoxuan, et al.
Published: (2026)

Using Vision Language Models to Detect Students' Academic Emotion through Facial Expressions
by: Wang, Deliang, et al.
Published: (2025)

Diffusion-guided Generalizable Enhancer for Urban Scene Reconstruction
by: Che, Henry, et al.
Published: (2026)

Seeing Beyond the Image: ECG and Anatomical Knowledge-Guided Myocardial Scar Segmentation from Late Gadolinium-Enhanced Images
by: Ramzan, Farheen, et al.
Published: (2025)

Deep Imbalanced Regression to Estimate Vascular Age from PPG Data: a Novel Digital Biomarker for Cardiovascular Health
by: Nie, Guangkun, et al.
Published: (2024)

Boost Self-Supervised Dataset Distillation via Parameterization, Predefined Augmentation, and Approximation
by: Yu, Sheng-Feng, et al.
Published: (2025)

Socratic Questioning: Learn to Self-guide Multimodal Reasoning in the Wild
by: Hu, Wanpeng, et al.
Published: (2025)

C3-Diff: Super-resolving Spatial Transcriptomics via Cross-modal Cross-content Contrastive Diffusion Modelling
by: Wang, Xiaofei, et al.
Published: (2025)

CROP: Expert-Aligned Image Cropping via Compositional Reasoning and Optimizing Preference
by: Dong, Zhitong, et al.
Published: (2026)

Chain-of-Models Pre-Training: Rethinking Training Acceleration of Vision Foundation Models
by: Fan, Jiawei, et al.
Published: (2026)

CoilDrop-MRI: Self-supervised physics-guided MRI reconstruction with coil dropout
by: Song, Tongxi, et al.
Published: (2026)

FLUID: Training-Free Face De-identification via Latent Identity Substitution
by: Park, Jinhyeong, et al.
Published: (2025)

Prompt Decoupling for Text-to-Image Person Re-identification
by: Li, Weihao, et al.
Published: (2024)

Uncertainty-guided Optimal Transport in Depth Supervised Sparse-View 3D Gaussian
by: Sun, Wei, et al.
Published: (2024)

Mixture-of-Visual-Thoughts: Exploring Context-Adaptive Reasoning Mode Selection for General Visual Reasoning
by: Li, Zejun, et al.
Published: (2025)

CogStream: Context-guided Streaming Video Question Answering
by: Zhao, Zicheng, et al.
Published: (2025)

EmambaIR: Efficient Visual State Space Model for Event-guided Image Reconstruction
by: Yu, Wei, et al.
Published: (2026)

How Bias Binds: Measuring Hidden Associations for Bias Control in Text-to-Image Compositions
by: Li, Jeng-Lin, et al.
Published: (2025)

Depth-guided NeRF Training via Earth Mover's Distance
by: Rau, Anita, et al.
Published: (2024)

Tera-MIND: Tera-scale mouse brain simulation via spatial mRNA-guided diffusion
by: Wu, Jiqing, et al.
Published: (2025)

Free Video-LLM: Prompt-guided Visual Perception for Efficient Training-free Video LLMs
by: Han, Kai, et al.
Published: (2024)

Pic2Diagnosis: A Method for Diagnosis of Cardiovascular Diseases from the Printed ECG Pictures
by: Büyüksolak, Oğuzhan, et al.
Published: (2025)

VGNC: Reducing the Overfitting of Sparse-view 3DGS via Validation-guided Gaussian Number Control
by: Lin, Lifeng, et al.
Published: (2025)