| Main Authors: | Bahadi, Soufiyan; Plourde, Eric; Rouat, Jean |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.06989 |
Similar Items
Efficient Sparse Coding with the Adaptive Locally Competitive Algorithm for Speech Classification
by: Bahadi, Soufiyan, et al.
Published: (2024)
Dynamic Frequency-Adaptive Knowledge Distillation for Speech Enhancement
by: Yuan, Xihao, et al.
Published: (2025)
Leveraging Local and Global Knowledge Integration with Time-Frequency Calibrated Distillation for Speech Enhancement
by: Cheng, Jiaming, et al.
Published: (2025)
Channel-Combination Algorithms for Robust Distant Voice Activity and Overlapped Speech Detection
by: Mariotte, Théo, et al.
Published: (2024)
Frequency-mix Knowledge Distillation for Fake Speech Detection
by: Fan, Cunhang, et al.
Published: (2024)
URGENT-PK: Perceptually-Aligned Ranking Model Designed for Speech Enhancement Competition
by: Wang, Jiahe, et al.
Published: (2025)
NoLACE: Improving Low-Complexity Speech Codec Enhancement Through Adaptive Temporal Shaping
by: Büthe, Jan, et al.
Published: (2023)
Asymmetric Encoder-Decoder Based on Time-Frequency Correlation for Speech Separation
by: Shin, Ui-Hyeop, et al.
Published: (2026)
FRCRN: Boosting Feature Representation using Frequency Recurrence for Monaural Speech Enhancement
by: Zhao, Shengkui, et al.
Published: (2022)
Temporal-Frequency State Space Duality: An Efficient Paradigm for Speech Emotion Recognition
by: Zhao, Jiaqi, et al.
Published: (2024)
Algorithms of Sampling-Frequency-Independent Layers for Non-integer Strides
by: Imamura, Kanami, et al.
Published: (2023)
Adaptive Convolution for CNN-based Speech Enhancement Models
by: Wang, Dahan, et al.
Published: (2025)
Central Kurdish Text-to-Speech Synthesis with Novel End-to-End Transformer Training
by: Ahmad, Hawraz A., et al.
Published: (2024)
A Phoneme-Scale Assessment of Multichannel Speech Enhancement Algorithms
by: Monir, Nasser-Eddine, et al.
Published: (2024)
Adaptive Speech Emotion Representation Learning Based On Dynamic Graph
by: Gao, Yingxue, et al.
Published: (2024)
Diversifying and Expanding Frequency-Adaptive Convolution Kernels for Sound Event Detection
by: Nam, Hyeonuk, et al.
Published: (2024)
SpeechRefiner: Towards Perceptual Quality Refinement for Front-End Algorithms
by: Li, Sirui, et al.
Published: (2025)
Evaluating Multichannel Speech Enhancement Algorithms at the Phoneme Scale Across Genders
by: Monir, Nasser-Eddine, et al.
Published: (2025)
TF-Mamba: A Time-Frequency Network for Sound Source Localization
by: Xiao, Yang, et al.
Published: (2024)
Combined Generative and Predictive Modeling for Speech Super-resolution
by: Wang, Heming, et al.
Published: (2024)
NanoVoice: Efficient Speaker-Adaptive Text-to-Speech for Multiple Speakers
by: Park, Nohil, et al.
Published: (2024)
Robust Localization of Partially Fake Speech: Metrics and Out-of-Domain Evaluation
by: Luong, Hieu-Thi, et al.
Published: (2025)
LORT: Locally Refined Convolution and Taylor Transformer for Monaural Speech Enhancement
by: Wang, Junyu, et al.
Published: (2025)
TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement
by: Saijo, Kohei, et al.
Published: (2024)
Can LLMs Help Localize Fake Words in Partially Fake Speech?
by: Zhang, Lin, et al.
Published: (2026)
DroFiT: A Lightweight Band-fused Frequency Attention Toward Real-time UAV Speech Enhancement
by: Lee, Jeongmin, et al.
Published: (2025)
Adaptive Data Augmentation with NaturalSpeech3 for Far-field Speaker Verification
by: Zhang, Li, et al.
Published: (2025)
RADE: A Neural Codec for Transmitting Speech over HF Radio Channels
by: Rowe, David, et al.
Published: (2025)
PolySpeech: Exploring Unified Multitask Speech Models for Competitiveness with Single-task Models
by: Yang, Runyan, et al.
Published: (2024)
Exploring Local Interpretable Model-Agnostic Explanations for Speech Emotion Recognition with Distribution-Shift
by: Hjuler, Maja J., et al.
Published: (2025)
Local Equivariance Error-Based Metrics for Evaluating Sampling-Frequency-Independent Property of Neural Network
by: Imamura, Kanami, et al.
Published: (2025)
SEF-PNet: Speaker Encoder-Free Personalized Speech Enhancement with Local and Global Contexts Aggregation
by: Huang, Ziling, et al.
Published: (2025)
SEMamba++: A General Speech Restoration Framework Leveraging Global, Local, and Periodic Spectral Patterns
by: Lee, Yongjoon, et al.
Published: (2026)
IKFST: IOO and KOO Algorithms for Accelerated and Precise WFST-based End-to-End Automatic Speech Recognition
by: Zhuang, Zhuoran, et al.
Published: (2026)
Spatial Reconstructed Local Attention Res2Net with F0 Subband for Fake Speech Detection
by: Fan, Cunhang, et al.
Published: (2023)
RealMAN: A Real-Recorded and Annotated Microphone Array Dataset for Dynamic Speech Enhancement and Localization
by: Yang, Bing, et al.
Published: (2024)
MSceneSpeech: A Multi-Scene Speech Dataset For Expressive Speech Synthesis
by: Yang, Qian, et al.
Published: (2024)
Say More with Less: Variable-Frame-Rate Speech Tokenization via Adaptive Clustering and Implicit Duration Coding
by: Zheng, Rui-Chen, et al.
Published: (2025)
VoiceGuider: Enhancing Out-of-Domain Performance in Parameter-Efficient Speaker-Adaptive Text-to-Speech via Autoguidance
by: Yeom, Jiheum, et al.
Published: (2024)
GLOBE: A High-quality English Corpus with Global Accents for Zero-shot Speaker Adaptive Text-to-Speech
by: Wang, Wenbin, et al.
Published: (2024)