:: Library Catalog

Bibliographic Details
Main Authors:	Chen, Sichen; Zhang, Yingyi; Huang, Siming; Yi, Ran; Fan, Ke; Zhang, Ruixin; Chen, Peixian; Wang, Jun; Ding, Shouhong; Ma, Lizhuang
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2404.03518

Similar Items

SDPose: Exploiting Diffusion Priors for Out-of-Domain and Robust Pose Estimation
by: Liang, Shuang, et al.
Published: (2025)

Test-Time Domain Generalization for Face Anti-Spoofing
by: Zhou, Qianyu, et al.
Published: (2024)

Switchable Token-Specific Codebook Quantization For Face Image Compression
by: Wang, Yongbo, et al.
Published: (2025)

PoseAnything: Universal Pose-guided Video Generation with Part-aware Temporal Coherence
by: Wang, Ruiyan, et al.
Published: (2025)

From Enhancement to Understanding: Build a Generalized Bridge for Low-light Vision via Semantically Consistent Unsupervised Fine-tuning
by: Wang, Sen, et al.
Published: (2025)

IAR2: Improving Autoregressive Visual Generation with Semantic-Detail Associated Token Prediction
by: Yi, Ran, et al.
Published: (2025)

Complementarity-Supervised Spectral-Band Routing for Multimodal Emotion Recognition
by: Huang, Zhexian, et al.
Published: (2026)

Reconstructing Topology-Consistent Face Mesh by Volume Rendering from Multi-View Images
by: Wang, Yating, et al.
Published: (2024)

Streaming Looking Ahead with Token-level Self-reward
by: Zhang, Hongming, et al.
Published: (2025)

AdR-Gaussian: Accelerating Gaussian Splatting with Adaptive Radius
by: Wang, Xinzhe, et al.
Published: (2024)

SuperMat: Physically Consistent PBR Material Estimation at Interactive Rates
by: Hong, Yijia, et al.
Published: (2024)

Improving Autoregressive Visual Generation with Cluster-Oriented Token Prediction
by: Hu, Teng, et al.
Published: (2025)

InstanceV: Instance-Level Video Generation
by: Chen, Yuheng, et al.
Published: (2025)

D2Pruner: Debiased Importance and Structural Diversity for MLLM Token Pruning
by: Zhang, Evelyn, et al.
Published: (2025)

Causality-aware Graph Aggregation Weight Estimator for Popularity Debiasing in Top-K Recommendation
by: Que, Yue, et al.
Published: (2025)

SAFE: Slow and Fast Parameter-Efficient Tuning for Continual Learning with Pre-Trained Models
by: Zhao, Linglan, et al.
Published: (2024)

Cross-Domain Knowledge Distillation for Low-Resolution Human Pose Estimation
by: Gu, Zejun, et al.
Published: (2024)

Fuse Before Transfer: Knowledge Fusion for Heterogeneous Distillation
by: Li, Guopeng, et al.
Published: (2024)

SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation
by: Hu, Teng, et al.
Published: (2024)

Towards Rationale-Answer Alignment of LVLMs via Self-Rationale Calibration
by: Wu, Yuanchen, et al.
Published: (2025)

GloTok: Global Perspective Tokenizer for Image Reconstruction and Generation
by: Zhao, Xuan, et al.
Published: (2025)

DiffusionFake: Enhancing Generalization in Deepfake Detection via Guided Stable Diffusion
by: Sun, Ke, et al.
Published: (2024)

Textual Decomposition Then Sub-motion-space Scattering for Open-Vocabulary Motion Generation
by: Fan, Ke, et al.
Published: (2024)

Collaborative Face Experts Fusion in Video Generation: Boosting Identity Consistency Across Large Face Poses
by: Wang, Yuji, et al.
Published: (2025)

Navigating the Emotion Tree: Hierarchical Hyperbolic RAG for Multimodal Emotion Recognition
by: Wang, Zeheng, et al.
Published: (2026)

PCIE_Pose Solution for EgoExo4D Pose and Proficiency Estimation Challenge
by: Chen, Feng, et al.
Published: (2025)

Fixed Anchors Are Not Enough: Dynamic Retrieval and Persistent Homology for Dataset Distillation
by: Li, Muquan, et al.
Published: (2026)

Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy
by: Shen, Yunhang, et al.
Published: (2025)

VISA: Group-wise Visual Token Selection and Aggregation via Graph Summarization for Efficient MLLMs Inference
by: Jiang, Pengfei, et al.
Published: (2025)

EyeSeg: An Uncertainty-Aware Eye Segmentation Framework for AR/VR
by: Peng, Zhengyuan, et al.
Published: (2025)

SCJD: Sparse Correlation and Joint Distillation for Efficient 3D Human Pose Estimation
by: Chen, Weihong, et al.
Published: (2025)

CleanPose: Category-Level Object Pose Estimation via Causal Learning and Knowledge Distillation
by: Lin, Xiao, et al.
Published: (2025)

LaRE$^2$: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection
by: Luo, Yunpeng, et al.
Published: (2024)

SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning
by: Ma, Peixian, et al.
Published: (2025)

FreeMotion: A Unified Framework for Number-free Text-to-Motion Synthesis
by: Fan, Ke, et al.
Published: (2024)

The Fruits of Opportunism: Noncompliance and the Evolution of China's Supplemental Education Industry by Le Lin, Chicago, IL, The University of Chicago Press, 2022, 244 pp.
by: Ma, Yingyi
Published: (2024)

Mind Your Margin and Boundary: Are Your Distilled Datasets Truly Robust?
by: Li, Muquan, et al.
Published: (2026)

Improved AdaBoost for Virtual Reality Experience Prediction Based on Long Short-Term Memory Network
by: Fan, Wenhan, et al.
Published: (2024)

One-for-More: Continual Diffusion Model for Anomaly Detection
by: Li, Xiaofan, et al.
Published: (2025)

MV-Adapter: Multi-view Consistent Image Generation Made Easy
by: Huang, Zehuan, et al.
Published: (2024)