:: Library Catalog

Bibliographic Details
Main authors:	Liu, Jing; Guo, Zhengliang; Wang, Yan; Zhu, Xiaoguang; Du, Yao; Wang, Zehua; Leung, Victor C. M.
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online access:	https://arxiv.org/abs/2603.19337

Similar Items

D3S2: Diffusion-Guided Dataset Distillation for Semantic Segmentation
by: Zheng, Wenjie, et al.
Published: (2026)

Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment
by: Wu, Jiaqi, et al.
Published: (2024)

Harnessing Group-Oriented Consistency Constraints for Semi-Supervised Semantic Segmentation in CdZnTe Semiconductors
by: Li, Peihao, et al.
Published: (2025)

A Dual-way Enhanced Framework from Text Matching Point of View for Multimodal Entity Linking
by: Song, Shezheng, et al.
Published: (2023)

GarmentDiffusion: 3D Garment Sewing Pattern Generation with Multimodal Diffusion Transformers
by: Li, Xinyu, et al.
Published: (2025)

POCI-Diff: Position Objects Consistently and Interactively with 3D-Layout Guided Diffusion
by: Rigo, Andrea, et al.
Published: (2026)

Multimodal LLM-Guided Semantic Correction in Text-to-Image Diffusion
by: Lv, Zheqi, et al.
Published: (2025)

CLIP-MUSED: CLIP-Guided Multi-Subject Visual Neural Information Semantic Decoding
by: Zhou, Qiongyi, et al.
Published: (2024)

The Thinking Pixel: Recursive Sparse Reasoning in Multimodal Diffusion Latents
by: Sun, Yuwei, et al.
Published: (2026)

Reward Guided Latent Consistency Distillation
by: Li, Jiachen, et al.
Published: (2024)

SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-form Layout-to-Image Generation
by: Jia, Chengyou, et al.
Published: (2023)

CLIPin: A Non-contrastive Plug-in to CLIP for Multimodal Semantic Alignment
by: Yang, Shengzhu, et al.
Published: (2025)

Affinity-Graph-Guided Contractive Learning for Pretext-Free Medical Image Segmentation with Minimal Annotation
by: Cheng, Zehua, et al.
Published: (2024)

Semantic Localization Guiding Segment Anything Model For Reference Remote Sensing Image Segmentation
by: Li, Shuyang, et al.
Published: (2025)

Mitigating Hallucinations in Large Vision-Language Models via Entity-Centric Multimodal Preference Optimization
by: Wu, Jiulong, et al.
Published: (2025)

Homogeneous and Heterogeneous Consistency progressive Re-ranking for Visible-Infrared Person Re-identification
by: Wang, Yiming
Published: (2026)

PointGS: Semantic-Consistent Unsupervised 3D Point Cloud Segmentation with 3D Gaussian Splatting
by: Song, Yixiao, et al.
Published: (2026)

DreamStory: Open-Domain Story Visualization by LLM-Guided Multi-Subject Consistent Diffusion
by: He, Huiguo, et al.
Published: (2024)

SVGDreamer: Text Guided SVG Generation with Diffusion Model
by: Xing, Ximing, et al.
Published: (2023)

Reference-Guided Diffusion Inpainting For Multimodal Counterfactual Generation
by: Buburuzan, Alexandru
Published: (2025)

AnchorDS: Anchoring Dynamic Sources for Semantically Consistent Text-to-3D Generation
by: Zhu, Jiayin, et al.
Published: (2025)

CRCL: Causal Representation Consistency Learning for Anomaly Detection in Surveillance Videos
by: Liu, Yang, et al.
Published: (2025)

Semantic Surgery: Zero-Shot Concept Erasure in Diffusion Models
by: Xiong, Lexiang, et al.
Published: (2025)

Robust Polyp Detection and Diagnosis through Compositional Prompt-Guided Diffusion Models
by: Yu, Jia, et al.
Published: (2025)

MCITlib: Multimodal Continual Instruction Tuning Library and Benchmark
by: Guo, Haiyang, et al.
Published: (2025)

Sam-Guided Enhanced Fine-Grained Encoding with Mixed Semantic Learning for Medical Image Captioning
by: Zhang, Zhenyu, et al.
Published: (2023)

Phys4D: Fine-Grained Physics-Consistent 4D Modeling from Video Diffusion
by: Lu, Haoran, et al.
Published: (2026)

GS-ID: Illumination Decomposition on Gaussian Splatting via Adaptive Light Aggregation and Diffusion-Guided Material Priors
by: Du, Kang, et al.
Published: (2024)

Text-Guided Layer Fusion Mitigates Hallucination in Multimodal LLMs
by: Lin, Chenchen, et al.
Published: (2026)

Timeline and Boundary Guided Diffusion Network for Video Shadow Detection
by: Zhou, Haipeng, et al.
Published: (2024)

TGC-Net: A Structure-Aware and Semantically-Aligned Framework for Text-Guided Medical Image Segmentation
by: Lin, Gaoren, et al.
Published: (2025)

Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation
by: Chen, Xiyi, et al.
Published: (2024)

Thinking Diffusion: Penalize and Guide Visual-Grounded Reasoning in Diffusion Multimodal Language Models
by: Kim, Keuntae, et al.
Published: (2026)

MuDD: A Multimodal Deception Detection Dataset and GSR-Guided Progressive Distillation for Non-Contact Deception Detection
by: Jiang, Peiyuan, et al.
Published: (2026)

A Modern Look at Simplicity Bias in Image Classification Tasks
by: Chang, Xiaoguang, et al.
Published: (2025)

DECADE: A Temporally-Consistent Unsupervised Diffusion Model for Enhanced Rb-82 Dynamic Cardiac PET Image Denoising
by: Zhou, Yinchi, et al.
Published: (2026)

Task Consistent Prototype Learning for Incremental Few-shot Semantic Segmentation
by: Xu, Wenbo, et al.
Published: (2024)

Exploring Homogeneous and Heterogeneous Consistent Label Associations for Unsupervised Visible-Infrared Person ReID
by: He, Lingfeng, et al.
Published: (2024)

PROSPECT: Unified Streaming Vision-Language Navigation via Semantic–Spatial Fusion and Latent Predictive Representation
by: Fan, Zehua, et al.
Published: (2026)

The Role of Visual Modality in Multimodal Mathematical Reasoning: Challenges and Insights
by: Liu, Yufang, et al.
Published: (2025)