| Main Authors: | Chen, Jun, Hu, Shichao, Lin, Jiuxin, Li, Wenjie, Zhang, Zihan, Li, Xingchen, Liu, JinJiang, Xiao, Longshuai, Weng, Chao, Xie, Lei, Wu, Zhiyong |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2510.10687 |
Similar Items
MeanFlowSE: One-Step Generative Speech Enhancement via MeanFlow
by: Zhu, Yike, et al.
Published: (2025)
DualSep: A Light-weight dual-encoder convolutional recurrent network for real-time in-car speech separation
by: Wang, Ziqian, et al.
Published: (2024)
LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement
by: Kang, Boyi, et al.
Published: (2025)
CabinSep: IR-Augmented Mask-Based MVDR for Real-Time In-Car Speech Separation with Distributed Heterogeneous Arrays
by: Han, Runduo, et al.
Published: (2025)
AISHELL-5: The First Open-Source In-Car Multi-Channel Multi-Speaker Speech Dataset for Automatic Speech Diarization and Recognition
by: Dai, Yuhang, et al.
Published: (2025)
A Fast and Lightweight Model for Causal Audio-Visual Speech Separation
by: Sang, Wendi, et al.
Published: (2025)
Summary on The Multilingual Conversational Speech Language Model Challenge: Datasets, Tasks, Baselines, and Methods
by: Mu, Bingshen, et al.
Published: (2025)
Advances in Speech Separation: Techniques, Challenges, and Future Trends
by: Li, Kai, et al.
Published: (2025)
A Lightweight and Real-Time Binaural Speech Enhancement Model with Spatial Cues Preservation
by: Wang, Jingyuan, et al.
Published: (2024)
Study of Lightweight Transformer Architectures for Single-Channel Speech Enhancement
by: Zhao, Haixin, et al.
Published: (2025)
Llasa+: Free Lunch for Accelerated and Streaming Llama-Based Speech Synthesis
by: Tian, Wenjie, et al.
Published: (2025)
KALL-E: Autoregressive Speech Synthesis with Next-Distribution Prediction
by: Xia, Kangxiang, et al.
Published: (2024)
SLM-SS: Speech Language Model for Generative Speech Separation
by: Li, Tianhua, et al.
Published: (2026)
Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition
by: Li, Guinan, et al.
Published: (2024)
RawTFNet: A Lightweight CNN Architecture for Speech Anti-spoofing
by: Xiao, Yang, et al.
Published: (2025)
Fake Speech Wild: Detecting Deepfake Speech on Social Media Platform
by: Xie, Yuankun, et al.
Published: (2025)
Optimizing Neural Architectures for Hindi Speech Separation and Enhancement in Noisy Environments
by: Ramamoorthy, Arnav
Published: (2025)
SenSE: Semantic-Aware High-Fidelity Universal Speech Enhancement
by: Li, Xingchen, et al.
Published: (2025)
SlimSpeech: Lightweight and Efficient Text-to-Speech with Slim Rectified Flow
by: Wang, Kaidi, et al.
Published: (2025)
CapTalk: Unified Voice Design for Single-Utterance and Dialogue Speech Generation
by: Su, Xiaosu, et al.
Published: (2026)
FleSpeech: Flexibly Controllable Speech Generation with Various Prompts
by: Li, Hanzhao, et al.
Published: (2025)
DialoSpeech: Dual-Speaker Dialogue Generation with LLM and Flow Matching
by: Xie, Hanke, et al.
Published: (2025)
From Coarse to Fine: Recursive Audio-Visual Semantic Enhancement for Speech Separation
by: Xue, Ke, et al.
Published: (2025)
Distil-DCCRN: A Small-footprint DCCRN Leveraging Feature-based Knowledge Distillation in Speech Enhancement
by: Han, Runduo, et al.
Published: (2024)
DurIAN-E 2: Duration Informed Attention Network with Adaptive Variational Autoencoder and Adversarial Learning for Expressive Text-to-Speech Synthesis
by: Gu, Yu, et al.
Published: (2024)
TF-CorrNet: Leveraging Spatial Correlation for Continuous Speech Separation
by: Shin, Ui-Hyeop, et al.
Published: (2025)
Time-Frequency-Based Attention Cache Memory Model for Real-Time Speech Separation
by: Chen, Guo, et al.
Published: (2025)
Neural personal sound zones with flexible bright zone control
by: Zhu, Wenye, et al.
Published: (2025)
AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-Speech Synthesis
by: Luo, Dan, et al.
Published: (2025)
dLLM-ASR: A Faster Diffusion LLM-based Framework for Speech Recognition
by: Tian, Wenjie, et al.
Published: (2026)
An Audio-Visual Speech Separation Model Inspired by Cortico-Thalamo-Cortical Circuits
by: Li, Kai, et al.
Published: (2022)
A Lightweight Fourier-based Network for Binaural Speech Enhancement with Spatial Cue Preservation
by: Lu, Xikun, et al.
Published: (2025)
Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-spoofing
by: Liu, Tianchi, et al.
Published: (2025)
How Well Do Current Speech Deepfake Detection Methods Generalize to the Real World?
by: Li, Daixian, et al.
Published: (2026)
Adaptive Data Augmentation with NaturalSpeech3 for Far-field Speaker Verification
by: Zhang, Li, et al.
Published: (2025)
Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement
by: Ren, Wenze, et al.
Published: (2024)
Leveraging Spatial Cues from Cochlear Implant Microphones to Efficiently Enhance Speech Separation in Real-World Listening Scenes
by: Olalere, Feyisayo, et al.
Published: (2025)
Enhancing Generalization of Speech Large Language Models with Multi-Task Behavior Imitation and Speech-Text Interleaving
by: Xie, Jingran, et al.
Published: (2025)
In This Environment, As That Speaker: A Text-Driven Framework for Multi-Attribute Speech Conversion
by: Jin, Jiawei, et al.
Published: (2025)
TF-MLPNet: Tiny Real-Time Neural Speech Separation
by: Itani, Malek, et al.
Published: (2025)