:: Library Catalog

Bibliographic Details
Main Authors:	Li, Sunan; Lian, Hailun; Lu, Cheng; Zhao, Yan; Qi, Tianhua; Yang, Hao; Zong, Yuan; Zheng, Wenming
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2407.12973

Similar Items

PAVITS: Exploring Prosody-aware VITS for End-to-End Emotional Voice Conversion
by: Qi, Tianhua, et al.
Published: (2024)

Incorporating Scene Context and Semantic Labels for Enhanced Group-level Emotion Recognition
by: Zhu, Qing, et al.
Published: (2025)

Signal-SGN: A Spiking Graph Convolutional Network for Skeletal Action Recognition via Learning Temporal-Frequency Dynamics
by: Zheng, Naichuan, et al.
Published: (2024)

Learning Transferable Facial Emotion Representations from Large-Scale Semantically Rich Captions
by: Sun, Licai, et al.
Published: (2025)

SNN-Driven Multimodal Human Action Recognition via Sparse Spatial-Temporal Data Fusion
by: Zheng, Naichuan, et al.
Published: (2025)

Improving Speaker-independent Speech Emotion Recognition Using Dynamic Joint Distribution Adaptation
by: Lu, Cheng, et al.
Published: (2024)

Multi-modal Speech Emotion Recognition via Feature Distribution Adaptation Network
by: Li, Shaokai, et al.
Published: (2024)

Feature-Based Dual Visual Feature Extraction Model for Compound Multimodal Emotion Recognition
by: Liu, Ran, et al.
Published: (2025)

Patch as Node: Human-Centric Graph Representation Learning for Multimodal Action Recognition
by: Liang, Zeyu, et al.
Published: (2025)

MK-SGN: A Spiking Graph Convolutional Network with Multimodal Fusion and Knowledge Distillation for Skeleton-based Action Recognition
by: Zheng, Naichuan, et al.
Published: (2024)

Emotion-Aware Contrastive Adaptation Network for Source-Free Cross-Corpus Speech Emotion Recognition
by: Zhao, Yan, et al.
Published: (2024)

Spatial Hierarchy and Temporal Attention Guided Cross Masking for Self-supervised Skeleton-based Action Recognition
by: Yin, Xinpeng, et al.
Published: (2024)

Speech Swin-Transformer: Exploring a Hierarchical Transformer with Shifted Windows for Speech Emotion Recognition
by: Wang, Yong, et al.
Published: (2024)

Uncertainty-Aware Label Refinement on Hypergraphs for Personalized Federated Facial Expression Recognition
by: Ding, Hu, et al.
Published: (2025)

Towards A Robust Group-level Emotion Recognition via Uncertainty-Aware Learning
by: Zhu, Qing, et al.
Published: (2023)

A Survey of Deep Learning for Group-level Emotion Recognition
by: Huang, Xiaohua, et al.
Published: (2024)

GPT-4V with Emotion: A Zero-shot Benchmark for Generalized Emotion Recognition
by: Lian, Zheng, et al.
Published: (2023)

MapChange: Enhancing Semantic Change Detection with Temporal-Invariant Historical Maps Based on Deep Triplet Network
by: Liu, Yinhe, et al.
Published: (2024)

Topological Symmetry Enhanced Graph Convolution for Skeleton-Based Action Recognition
by: Liang, Zeyu, et al.
Published: (2024)

LiquidTAD: Efficient Temporal Action Detection via Parallel Liquid-Inspired Temporal Relaxation
by: Sun, Zepeng, et al.
Published: (2026)

MoSFormer: Augmenting Temporal Context with Memory of Surgery for Surgical Phase Recognition
by: Ding, Hao, et al.
Published: (2025)

Towards Realistic Emotional Voice Conversion using Controllable Emotional Intensity
by: Qi, Tianhua, et al.
Published: (2024)

Boosting Continuous Emotion Recognition with Self-Pretraining using Masked Autoencoders, Temporal Convolutional Networks, and Transformers
by: Zhou, Weiwei, et al.
Published: (2024)

Emotion Recognition Using Convolutional Neural Networks
by: Xu, Shaoyuan, et al.
Published: (2025)

Partial Label Learning for Emotion Recognition from EEG
by: Zhang, Guangyi, et al.
Published: (2023)

Emotion-Qwen: A Unified Framework for Emotion and Vision Understanding
by: Huang, Dawei, et al.
Published: (2025)

A Chain of Diagnosis Framework for Accurate and Explainable Radiology Report Generation
by: Jin, Haibo, et al.
Published: (2025)

Learning Noise-Robust Joint Representation for Multimodal Emotion Recognition under Incomplete Data Scenarios
by: Fan, Qi, et al.
Published: (2023)

Textualized and Feature-based Models for Compound Multimodal Emotion Recognition in the Wild
by: Richet, Nicolas, et al.
Published: (2024)

A Multimodal Fusion Network For Student Emotion Recognition Based on Transformer and Tensor Product
by: Xiang, Ao, et al.
Published: (2024)

Pose Magic: Efficient and Temporally Consistent Human Pose Estimation with a Hybrid Mamba-GCN Network
by: Zhang, Xinyi, et al.
Published: (2024)

Are Spatial-Temporal Graph Convolution Networks for Human Action Recognition Over-Parameterized?
by: Xie, Jianyang, et al.
Published: (2025)

S3T-Former: A Purely Spike-Driven State-Space Topology Transformer for Skeleton Action Recognition
by: Zheng, Naichuan, et al.
Published: (2026)

Semi-Supervised Hyperspectral Image Classification with Edge-Aware Superpixel Label Propagation and Adaptive Pseudo-Labeling
by: Qiu, Yunfei, et al.
Published: (2026)

FEALLM: Advancing Facial Emotion Analysis in Multimodal Large Language Models with Emotional Synergy and Reasoning
by: Hu, Zhuozhao, et al.
Published: (2025)

Decoupled Doubly Contrastive Learning for Cross Domain Facial Action Unit Detection
by: Li, Yong, et al.
Published: (2025)

Self-Supervised Place Recognition by Refining Temporal and Featural Pseudo Labels from Panoramic Data
by: Chen, Chao, et al.
Published: (2022)

Task-Augmented Cross-View Imputation Network for Partial Multi-View Incomplete Multi-Label Classification
by: Zhao, Lian, et al.
Published: (2024)

Dual Stream Independence Decoupling for True Emotion Recognition under Masked Expressions
by: Wei, Jinsheng, et al.
Published: (2026)

A Survey on Facial Expression Recognition of Static and Dynamic Emotions
by: Wang, Yan, et al.
Published: (2024)