:: Library Catalog

Bibliographic Details
Main Authors:	Zhou, Hantao; Yang, Rui; Zhang, Yachao; Duan, Haoran; Huang, Yawen; Hu, Runze; Li, Xiu; Zheng, Yefeng
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2309.13242

Similar Items

UniQA: Unified Vision-Language Pre-training for Image Quality and Aesthetic Assessment
by: Zhou, Hantao, et al.
Published: (2024)

Video Object Segmentation with Dynamic Query Modulation
by: Zhou, Hantao, et al.
Published: (2024)

Gamma: Toward Generic Image Assessment with Mixture of Assessment Experts
by: Zhou, Hantao, et al.
Published: (2025)

Prototype Correlation Matching and Class-Relation Reasoning for Few-Shot Medical Image Segmentation
by: Zhang, Yumin, et al.
Published: (2024)

CTNeRF: Cross-Time Transformer for Dynamic Neural Radiance Field from Monocular Video
by: Miao, Xingyu, et al.
Published: (2024)

ConRF: Zero-shot Stylization of 3D Scenes with Conditioned Radiation Fields
by: Miao, Xingyu, et al.
Published: (2024)

Wearable-based behaviour interpolation for semi-supervised human activity recognition
by: Duan, Haoran, et al.
Published: (2024)

Rethinking Brain Tumor Segmentation from the Frequency Domain Perspective
by: Shao, Minye, et al.
Published: (2025)

UniADC: A Unified Framework for Anomaly Detection and Classification
by: Zhang, Ximiao, et al.
Published: (2025)

AnomalyXFusion: Multi-modal Anomaly Synthesis with Diffusion
by: Hu, Jie, et al.
Published: (2024)

SP-SLAM: Neural Real-Time Dense SLAM With Scene Priors
by: Hong, Zhen, et al.
Published: (2025)

TRACE: Temporally Reliable Anatomically-Conditioned 3D CT Generation with Enhanced Efficiency
by: Shao, Minye, et al.
Published: (2025)

NeRF2Points: Large-Scale Point Cloud Generation From Street Views' Radiance Field Optimization
by: Tu, Peng, et al.
Published: (2024)

UniAV: Unified Audio-Visual Perception for Multi-Task Video Event Localization
by: Geng, Tiantian, et al.
Published: (2024)

X-ray Insights Unleashed: Pioneering the Enhancement of Multi-Label Long-Tail Data
by: Yang, Xinquan, et al.
Published: (2025)

Decoupled Sensitivity-Consistency Learning for Weakly Supervised Video Anomaly Detection
by: Zheng, Hantao, et al.
Published: (2026)

Two Heads are Better than One: Robust Learning Meets Multi-branch Models
by: Zhang, Zongyuan, et al.
Published: (2022)

HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting
by: Zhou, Zhenglin, et al.
Published: (2024)

URoadNet: Dual Sparse Attentive U-Net for Multiscale Road Network Extraction
by: Song, Jie, et al.
Published: (2024)

RTHDet: Rotate Table Area and Head Detection in images
by: Hu, Wenxing, et al.
Published: (2023)

TalkingHeadBench: A Multi-Modal Benchmark & Analysis of Talking-Head DeepFake Detection
by: Xiong, Xinqi, et al.
Published: (2025)

M2StyleGS: Multi-Modality 3D Style Transfer with Gaussian Splatting
by: Miao, Xingyu, et al.
Published: (2026)

HeadHunt-VAD: Hunting Robust Anomaly-Sensitive Heads in MLLM for Tuning-Free Video Anomaly Detection
by: Cai, Zhaolin, et al.
Published: (2025)

UniAlignment: Semantic Alignment for Unified Image Generation, Understanding, Manipulation and Perception
by: Song, Xinyang, et al.
Published: (2025)

Improving Vision Transformers by Overlapping Heads in Multi-Head Self-Attention
by: Zhang, Tianxiao, et al.
Published: (2024)

SoulX-FlashHead: Oracle-guided Generation of Infinite Real-time Streaming Talking Heads
by: Yu, Tan, et al.
Published: (2026)

UniT: Unified Geometry Learning with Group Autoregressive Transformer
by: Wang, Haotian, et al.
Published: (2026)

MMTL-UniAD: A Unified Framework for Multimodal and Multi-Task Learning in Assistive Driving Perception
by: Liu, Wenzhuo, et al.
Published: (2025)

Uni-MMMU: A Massive Multi-discipline Multimodal Unified Benchmark
by: Zou, Kai, et al.
Published: (2025)

Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction
by: Inclusion AI, et al.
Published: (2025)

UniM$^2$AE: Multi-modal Masked Autoencoders with Unified 3D Representation for 3D Perception in Autonomous Driving
by: Zou, Jian, et al.
Published: (2023)

HeadGAP: Few-Shot 3D Head Avatar via Generalizable Gaussian Priors
by: Zheng, Xiaozheng, et al.
Published: (2024)

UniPR: Unified Object-level Real-to-Sim Perception and Reconstruction from a Single Stereo Pair
by: Zhang, Chuanrui, et al.
Published: (2026)

UniHDA: A Unified and Versatile Framework for Multi-Modal Hybrid Domain Adaptation
by: Li, Hengjia, et al.
Published: (2024)

Radiology Report Generation for Low-Quality X-Ray Images
by: Zhu, Hongze, et al.
Published: (2026)

Dual-Head Knowledge Distillation: Enhancing Logits Utilization with an Auxiliary Head
by: Yang, Penghui, et al.
Published: (2024)

Structure Observation Driven Image-Text Contrastive Learning for Computed Tomography Report Generation
by: Liu, Hong, et al.
Published: (2026)

StyleTalk++: A Unified Framework for Controlling the Speaking Styles of Talking Heads
by: Wang, Suzhen, et al.
Published: (2024)

UniRecGen: Unifying Multi-View 3D Reconstruction and Generation
by: Huang, Zhisheng, et al.
Published: (2026)

Jump Cut Smoothing for Talking Heads
by: Wang, Xiaojuan, et al.
Published: (2024)