| Main Authors: | Chen, Sichen, Zhang, Yingyi, Huang, Siming, Yi, Ran, Fan, Ke, Zhang, Ruixin, Chen, Peixian, Wang, Jun, Ding, Shouhong, Ma, Lizhuang |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2404.03518 |
Similar Items
SDPose: Exploiting Diffusion Priors for Out-of-Domain and Robust Pose Estimation
by: Liang, Shuang, et al.
Published: (2025)
Test-Time Domain Generalization for Face Anti-Spoofing
by: Zhou, Qianyu, et al.
Published: (2024)
Switchable Token-Specific Codebook Quantization For Face Image Compression
by: Wang, Yongbo, et al.
Published: (2025)
PoseAnything: Universal Pose-guided Video Generation with Part-aware Temporal Coherence
by: Wang, Ruiyan, et al.
Published: (2025)
From Enhancement to Understanding: Build a Generalized Bridge for Low-light Vision via Semantically Consistent Unsupervised Fine-tuning
by: Wang, Sen, et al.
Published: (2025)
IAR2: Improving Autoregressive Visual Generation with Semantic-Detail Associated Token Prediction
by: Yi, Ran, et al.
Published: (2025)
Complementarity-Supervised Spectral-Band Routing for Multimodal Emotion Recognition
by: Huang, Zhexian, et al.
Published: (2026)
Reconstructing Topology-Consistent Face Mesh by Volume Rendering from Multi-View Images
by: Wang, Yating, et al.
Published: (2024)
Streaming Looking Ahead with Token-level Self-reward
by: Zhang, Hongming, et al.
Published: (2025)
AdR-Gaussian: Accelerating Gaussian Splatting with Adaptive Radius
by: Wang, Xinzhe, et al.
Published: (2024)
SuperMat: Physically Consistent PBR Material Estimation at Interactive Rates
by: Hong, Yijia, et al.
Published: (2024)
Improving Autoregressive Visual Generation with Cluster-Oriented Token Prediction
by: Hu, Teng, et al.
Published: (2025)
InstanceV: Instance-Level Video Generation
by: Chen, Yuheng, et al.
Published: (2025)
D2Pruner: Debiased Importance and Structural Diversity for MLLM Token Pruning
by: Zhang, Evelyn, et al.
Published: (2025)
Causality-aware Graph Aggregation Weight Estimator for Popularity Debiasing in Top-K Recommendation
by: Que, Yue, et al.
Published: (2025)
SAFE: Slow and Fast Parameter-Efficient Tuning for Continual Learning with Pre-Trained Models
by: Zhao, Linglan, et al.
Published: (2024)
Cross-Domain Knowledge Distillation for Low-Resolution Human Pose Estimation
by: Gu, Zejun, et al.
Published: (2024)
Fuse Before Transfer: Knowledge Fusion for Heterogeneous Distillation
by: Li, Guopeng, et al.
Published: (2024)
SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation
by: Hu, Teng, et al.
Published: (2024)
Towards Rationale-Answer Alignment of LVLMs via Self-Rationale Calibration
by: Wu, Yuanchen, et al.
Published: (2025)
GloTok: Global Perspective Tokenizer for Image Reconstruction and Generation
by: Zhao, Xuan, et al.
Published: (2025)
DiffusionFake: Enhancing Generalization in Deepfake Detection via Guided Stable Diffusion
by: Sun, Ke, et al.
Published: (2024)
Textual Decomposition Then Sub-motion-space Scattering for Open-Vocabulary Motion Generation
by: Fan, Ke, et al.
Published: (2024)
Collaborative Face Experts Fusion in Video Generation: Boosting Identity Consistency Across Large Face Poses
by: Wang, Yuji, et al.
Published: (2025)
Navigating the Emotion Tree: Hierarchical Hyperbolic RAG for Multimodal Emotion Recognition
by: Wang, Zeheng, et al.
Published: (2026)
PCIE_Pose Solution for EgoExo4D Pose and Proficiency Estimation Challenge
by: Chen, Feng, et al.
Published: (2025)
Fixed Anchors Are Not Enough: Dynamic Retrieval and Persistent Homology for Dataset Distillation
by: Li, Muquan, et al.
Published: (2026)
Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy
by: Shen, Yunhang, et al.
Published: (2025)
VISA: Group-wise Visual Token Selection and Aggregation via Graph Summarization for Efficient MLLMs Inference
by: Jiang, Pengfei, et al.
Published: (2025)
EyeSeg: An Uncertainty-Aware Eye Segmentation Framework for AR/VR
by: Peng, Zhengyuan, et al.
Published: (2025)
SCJD: Sparse Correlation and Joint Distillation for Efficient 3D Human Pose Estimation
by: Chen, Weihong, et al.
Published: (2025)
CleanPose: Category-Level Object Pose Estimation via Causal Learning and Knowledge Distillation
by: Lin, Xiao, et al.
Published: (2025)
LaRE$^2$: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection
by: Luo, Yunpeng, et al.
Published: (2024)
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning
by: Ma, Peixian, et al.
Published: (2025)
FreeMotion: A Unified Framework for Number-free Text-to-Motion Synthesis
by: Fan, Ke, et al.
Published: (2024)
The Fruits of Opportunism: Noncompliance and the Evolution of China's Supplemental Education Industry by Le Lin, Chicago, IL, The University of Chicago Press, 2022, 244 pp.
by: Yingyi Ma
Published: (2024)
Mind Your Margin and Boundary: Are Your Distilled Datasets Truly Robust?
by: Li, Muquan, et al.
Published: (2026)
Improved AdaBoost for Virtual Reality Experience Prediction Based on Long Short-Term Memory Network
by: Fan, Wenhan, et al.
Published: (2024)
One-for-More: Continual Diffusion Model for Anomaly Detection
by: Li, Xiaofan, et al.
Published: (2025)
MV-Adapter: Multi-view Consistent Image Generation Made Easy
by: Huang, Zehuan, et al.
Published: (2024)