| Main Authors: | Liang, Yayun, Zhang, Yuanming, Chen, Fei, Lu, Jing, Lin, Zhibin |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.20542 |
Similar Items
Decoding Stimulus Reconstruction-Based Auditory Attention Robustly in Unbalanced EEG Datasets
by: Zhang, Yuanming, et al.
Published: (2026)
Multi-class Decoding of Attended Speaker Direction Using Electroencephalogram and Audio Spatial Spectrum
by: Zhang, Yuanming, et al.
Published: (2024)
Auditory Attention Decoding from Ear-EEG Signals: A Dataset with Dynamic Attention Switching and Rigorous Cross-Validation
by: Zhang, Yuanming, et al.
Published: (2025)
A Lightweight Hybrid Dual Channel Speech Enhancement System under Low-SNR Conditions
by: Wang, Zheng, et al.
Published: (2025)
Comparator Loss: An Ordinal Contrastive Loss to Derive a Severity Score for Speech-based Health Monitoring
by: Webber, Jacob J, et al.
Published: (2025)
Asymmetric Encoder-Decoder Based on Time-Frequency Correlation for Speech Separation
by: Shin, Ui-Hyeop, et al.
Published: (2026)
Noise-Aware Speech Separation with Contrastive Learning
by: Zhang, Zizheng, et al.
Published: (2023)
Streaming Speech Recognition with Decoder-Only Large Language Models and Latency Optimization
by: Wan, Genshun, et al.
Published: (2026)
Exploiting Consistency-Preserving Loss and Perceptual Contrast Stretching to Boost SSL-based Speech Enhancement
by: Khan, Muhammad Salman, et al.
Published: (2024)
Learning Disentangled Speech Representations with Contrastive Learning and Time-Invariant Retrieval
by: Deng, Yimin, et al.
Published: (2024)
Reverberation-Robust Localization of Speakers Using Distinct Speech Onsets and Multi-channel Cross-Correlations
by: Lin, Shoufeng
Published: (2026)
VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech
by: Du, Chenpeng, et al.
Published: (2024)
DeCRED: Decoder-Centric Regularization for Encoder-Decoder Based Speech Recognition
by: Polok, Alexander, et al.
Published: (2025)
Improving Automatic Speech Recognition with Decoder-Centric Regularisation in Encoder-Decoder Models
by: Polok, Alexander, et al.
Published: (2024)
GESI: Gammachirp Envelope Similarity Index for Predicting Intelligibility of Simulated Hearing Loss Sounds
by: Yamamoto, Ayako, et al.
Published: (2023)
Enhancement of Dysarthric Speech Reconstruction by Contrastive Learning
by: Fatemeh, Keshvari, et al.
Published: (2024)
Separate and Reconstruct: Asymmetric Encoder-Decoder for Speech Separation
by: Shin, Ui-Hyeop, et al.
Published: (2024)
FNSE-SBGAN: Far-field Speech Enhancement with Schrodinger Bridge and Generative Adversarial Networks
by: Lei, Tong, et al.
Published: (2025)
EMO-RL: Emotion-Rule-Based Reinforcement Learning Enhanced Audio-Language Model for Generalized Speech Emotion Recognition
by: Li, Pengcheng, et al.
Published: (2025)
Rethinking Flow and Diffusion Bridge Models for Speech Enhancement
by: Wang, Dahan, et al.
Published: (2026)
VoCodec: An Efficient Lightweight Low-Bitrate Speech Codec
by: Yang, Leyan, et al.
Published: (2026)
Modeling Multi-Level Hearing Loss for Speech Intelligibility Prediction
by: Zhou, Xiajie, et al.
Published: (2025)
Speech Separation using Neural Audio Codecs with Embedding Loss
by: Yip, Jia Qi, et al.
Published: (2024)
Streaming Decoder-Only Automatic Speech Recognition with Discrete Speech Units: A Pilot Study
by: Chen, Peikun, et al.
Published: (2024)
Towards Fine-Grained and Multi-Granular Contrastive Language-Speech Pre-training
by: Yang, Yifan, et al.
Published: (2026)
Quartered Chirp Spectral Envelope for Whispered vs Normal Speech Classification
by: Joysingh, S. Johanan, et al.
Published: (2024)
Accelerating Autoregressive Speech Synthesis Inference With Speech Speculative Decoding
by: Lin, Zijian, et al.
Published: (2025)
Large Language Model Guided Decoding for Self-Supervised Speech Recognition
by: Cohen, Eyal, et al.
Published: (2025)
Speech-Omni-Lite: Portable Speech Interfaces for Vision-Language Models
by: Tao, Dehua, et al.
Published: (2026)
Attention-Constrained Inference for Robust Decoder-Only Text-to-Speech
by: Wang, Hankun, et al.
Published: (2024)
Audiobook-CC: Controllable Long-context Speech Generation for Multicast Audiobook
by: Liu, Min, et al.
Published: (2025)
Subject Disentanglement Neural Network for Speech Envelope Reconstruction from EEG
by: Zhang, Li, et al.
Published: (2025)
Performance Modeling for Correlation-based Neural Decoding of Auditory Attention to Speech
by: Geirnaert, Simon, et al.
Published: (2025)
SACM: SEEG-Audio Contrastive Matching for Chinese Speech Decoding
by: Wang, Hongbin, et al.
Published: (2025)
Deep Filter Estimation from Inter-Frame Correlations for Monaural Speech Dereverberation
by: Shin, Ui-Hyeop, et al.
Published: (2026)
Attention-Based Beamformer For Multi-Channel Speech Enhancement
by: Bai, Jinglin, et al.
Published: (2024)
FairASR: Fair Audio Contrastive Learning for Automatic Speech Recognition
by: Kim, Jongsuk, et al.
Published: (2025)
Attention-weighted Centered Kernel Alignment for Knowledge Distillation in Large Audio-Language Models Applied to Speech Emotion Recognition
by: Yang, Qingran, et al.
Published: (2026)
Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models
by: Jing, Xin, et al.
Published: (2024)
Mamba-based Decoder-Only Approach with Bidirectional Speech Modeling for Speech Recognition
by: Masuyama, Yoshiki, et al.
Published: (2024)