Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Lee, Junghyun; Kim, Hyunseo; Jang, Hanna; Noh, Junhyug
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.21417

_version_	1866913159177043968
author	Lee, Junghyun; Kim, Hyunseo; Jang, Hanna; Noh, Junhyug
author_facet	Lee, Junghyun; Kim, Hyunseo; Jang, Hanna; Noh, Junhyug
contents	Blended emotion recognition is challenging because emotions are often expressed as mixtures of subtle and overlapping multimodal cues rather than a single dominant signal. We propose a rank-aware multi-encoder framework that selectively combines complementary representations from diverse pre-extracted video and audio encoders. Our method projects heterogeneous encoder features into a shared latent space, estimates sample-wise encoder importance through an attention-based gating module, and fuses only the top-n most informative encoders. To better model blended emotions, we decouple prediction into presence and salience heads and align them through probability-level fusion. We further incorporate feature-level unsupervised domain adaptation without pseudo-labeling to improve robustness under distribution shift. Experiments on the BlEmoRE challenge show that the proposed framework outperforms strong individual encoders and naïve multi-encoder fusion baselines. Our final system ranked 2nd in the competition, supporting the effectiveness of rank-aware selective fusion for fine-grained blended emotion recognition.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_21417
institution	arXiv
publishDate	2026
record_format	arxiv
title	Ordering Matters: Rank-Aware Selective Fusion for Blended Emotion Recognition
topic	Computer Vision and Pattern Recognition; Artificial Intelligence
url	https://arxiv.org/abs/2605.21417
