| Main Authors: | Huang, Shun, Fang, Zhihua, He, Liang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2509.13853 |
Similar Items
Spoken Language Identification with Pre-trained Models and Margin Loss
by: Fang, Zhihua, et al.
Published: (2026)
Deep Supervised Contrastive Learning of Pitch Contours for Robust Pitch Accent Classification in Seoul Korean
by: Joo, Hyunjung, et al.
Published: (2026)
A Cross-Corpus Speech Emotion Recognition Method Based on Supervised Contrastive Learning
by: Minjie, Xiang
Published: (2024)
Melody or Machine: Detecting Synthetic Music with Dual-Stream Contrastive Learning
by: Batra, Arnesh, et al.
Published: (2025)
Few-Shot Contrastive Adaptation for Audio Abuse Detection in Low-Resource Indic Languages
by: Sankaran, Aditya Narayan, et al.
Published: (2026)
Fine-grained Video Dubbing Duration Alignment with Segment Supervised Preference Optimization
by: Cui, Chaoqun, et al.
Published: (2025)
Noro: Noise-Robust One-shot Voice Conversion with Hidden Speaker Representation Learning
by: He, Haorui, et al.
Published: (2024)
Massive Sound Embedding Benchmark (MSEB)
by: Heigold, Georg, et al.
Published: (2026)
Learning More with Less: Self-Supervised Approaches for Low-Resource Speech Emotion Recognition
by: Gong, Ziwei, et al.
Published: (2025)
Contrastive Learning with Spectrum Information Augmentation in Abnormal Sound Detection
by: Meng, Xinxin, et al.
Published: (2025)
Textless Acoustic Model with Self-Supervised Distillation for Noise-Robust Expressive Speech-to-Speech Translation
by: Hwang, Min-Jae, et al.
Published: (2024)
WhiSPA: Semantically and Psychologically Aligned Whisper with Self-Supervised Contrastive and Student-Teacher Learning
by: Rao, Rajath, et al.
Published: (2025)
Exploring Self-Supervised Multi-view Contrastive Learning for Speech Emotion Recognition with Limited Annotations
by: Khaertdinov, Bulat, et al.
Published: (2024)
Reduce, Reuse, Recycle: Is Perturbed Data better than Other Language augmentation for Low Resource Self-Supervised Speech Models
by: Ullah, Asad, et al.
Published: (2023)
Pushing the Performance of Synthetic Speech Detection with Kolmogorov-Arnold Networks and Self-Supervised Learning Models
by: Phuong, Tuan Dat, et al.
Published: (2025)
Are Sounds Sound for Phylogenetic Reconstruction?
by: Häuser, Luise, et al.
Published: (2024)
Explainable Transformer-CNN Fusion for Noise-Robust Speech Emotion Recognition
by: Chakrabarty, Sudip, et al.
Published: (2025)
Comparison of sEMG Encoding Accuracy Across Speech Modes Using Articulatory and Phoneme Features
by: Le, Chenqian, et al.
Published: (2026)
Effective Noise-aware Data Simulation for Domain-adaptive Speech Enhancement Leveraging Dynamic Stochastic Perturbation
by: Wang, Chien-Chun, et al.
Published: (2024)
StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs
by: Song, Yuhan, et al.
Published: (2025)
Thinking with Sound: Audio Chain-of-Thought Enables Multimodal Reasoning in Large Audio-Language Models
by: Xiong, Zhen, et al.
Published: (2025)
Hierarchical Self-Supervised Representation Learning for Depression Detection from Speech
by: Li, Yuxin, et al.
Published: (2025)
Causally Disentangled Contrastive Learning for Multilingual Speaker Embeddings
by: Olijslager, Mariëtte, et al.
Published: (2026)
Speech Separation based on Contrastive Learning and Deep Modularization
by: Ochieng, Peter
Published: (2023)
End-to-end Contrastive Language-Speech Pretraining Model For Long-form Spoken Question Answering
by: Hu, Jiliang, et al.
Published: (2025)
Audio Contrastive-based Fine-tuning: Decoupling Representation Learning and Classification
by: Wang, Yang, et al.
Published: (2023)
A Comprehensive Study on the Effectiveness of ASR Representations for Noise-Robust Speech Emotion Recognition
by: Shi, Xiaohan, et al.
Published: (2023)
Hyperbolic Additive Margin Softmax with Hierarchical Information for Speaker Verification
by: Fang, Zhihua, et al.
Published: (2026)
XM-ALIGN: Unified Cross-Modal Embedding Alignment for Face-Voice Association
by: Fang, Zhihua, et al.
Published: (2025)
Self-Supervised Learning for Multi-Channel Neural Transducer
by: Kojima, Atsushi
Published: (2024)
Multilingual Prosody Transfer: Comparing Supervised & Transfer Learning
by: Goel, Arnav, et al.
Published: (2024)
Emotion-Anchored Contrastive Learning Framework for Emotion Recognition in Conversation
by: Yu, Fangxu, et al.
Published: (2024)
Adapting Self-Supervised Speech Representations for Cross-lingual Dysarthria Detection in Parkinson's Disease
by: Hernandez, Abner, et al.
Published: (2026)
How Contrastive Decoding Enhances Large Audio Language Models?
by: Lin, Tzu-Quan, et al.
Published: (2026)
Exploring Self-Supervised Audio Models for Generalized Anomalous Sound Detection
by: Han, Bing, et al.
Published: (2025)
Analyzing Multimodal Features of Spontaneous Voice Assistant Commands for Mild Cognitive Impairment Detection
by: Lin, Nana, et al.
Published: (2024)
R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces
by: Chang, Heng-Jui, et al.
Published: (2023)
How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario
by: Wang, Shih-Heng, et al.
Published: (2024)
Leave No Knowledge Behind During Knowledge Distillation: Towards Practical and Effective Knowledge Distillation for Code-Switching ASR Using Realistic Data
by: Tseng, Liang-Hsuan, et al.
Published: (2024)
Continual Speech Learning with Fused Speech Features
by: Wang, Guitao, et al.
Published: (2025)