| Main Authors: | Phuong, Tuan Dat; Truong, Duc-Tuan; Hoang, Long-Vu; Thu, Trang Nguyen Thi |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.04702 |
Similar Items
Pushing the Performance of Synthetic Speech Detection with Kolmogorov-Arnold Networks and Self-Supervised Learning Models
by: Phuong, Tuan Dat, et al.
Published: (2025)
Temporal-Channel Modeling in Multi-head Self-Attention for Synthetic Speech Detection
by: Truong, Duc-Tuan, et al.
Published: (2024)
VoxVietnam: a Large-Scale Multi-Genre Dataset for Vietnamese Speaker Recognition
by: Vu, Hoang Long, et al.
Published: (2024)
Qwen vs. Gemma Integration with Whisper: A Comparative Study in Multilingual SpeechLLM Systems
by: Nguyen, Tuan, et al.
Published: (2025)
XLSR-Kanformer: A KAN-Intergrated model for Synthetic Speech Detection
by: Dat, Phuong Tuan, et al.
Published: (2025)
Acoustic scattering AI for non-invasive object classifications: A case study on hair assessment
by: Hoang, Long-Vu, et al.
Published: (2025)
QAMO: Quality-aware Multi-centroid One-class Learning For Speech Deepfake Detection
by: Truong, Duc-Tuan, et al.
Published: (2025)
Addressing Gradient Misalignment in Data-Augmented Training for Robust Speech Deepfake Detection
by: Truong, Duc-Tuan, et al.
Published: (2025)
Continuous Learning of Transformer-based Audio Deepfake Detection
by: Le, Tuan Duy Nguyen, et al.
Published: (2024)
AsyncSwitch: Asynchronous Text-Speech Adaptation for Code-Switched ASR
by: Nguyen, Tuan, et al.
Published: (2025)
AdaCS: Adaptive Normalization for Enhanced Code-Switching ASR
by: Chu, The Chuong, et al.
Published: (2025)
MAGE: A Coarse-to-Fine Speech Enhancer with Masked Generative Model
by: Pham, The Hieu, et al.
Published: (2025)
Zero-Shot Text-to-Speech for Vietnamese
by: Vu, Thi, et al.
Published: (2025)
Room Impulse Responses help attackers to evade Deep Fake Detection
by: Luong, Hieu-Thi, et al.
Published: (2024)
A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection
by: Pham, Lam, et al.
Published: (2024)
Environmental Sound Deepfake Detection Using Deep-Learning Framework
by: Pham, Lam, et al.
Published: (2026)
Mispronunciation Detection and Diagnosis Without Model Training: A Retrieval-Based Approach
by: Tu, Huu Tuong, et al.
Published: (2025)
ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription
by: Le, Khanh, et al.
Published: (2025)
Can we train ASR systems on Code-switch without real code-switch data? Case study for Singapore's languages
by: Nguyen, Tuan, et al.
Published: (2025)
SegAug: CTC-Aligned Segmented Augmentation For Robust RNN-Transducer Based Speech Recognition
by: Le, Khanh, et al.
Published: (2025)
A General Model for Deepfake Speech Detection: Diverse Bonafide Resources or Diverse AI-Based Generators
by: Pham, Lam, et al.
Published: (2026)
Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-spoofing
by: Liu, Tianchi, et al.
Published: (2025)
MultiMed-ST: Large-scale Many-to-many Multilingual Medical Speech Translation
by: Le-Duc, Khai, et al.
Published: (2025)
Toward Fine-Grained Speech Inpainting Forensics: A Dataset, Method, and Metric for Multi-Region Tampering Localization
by: Vu, Tung, et al.
Published: (2026)
Hierarchical Decoding for Discrete Speech Synthesis with Multi-Resolution Spoof Detection
by: Zhao, Junchuan, et al.
Published: (2026)
Attention-based Mixture of Experts for Robust Speech Deepfake Detection
by: Negroni, Viola, et al.
Published: (2025)
O_O-VC: Synthetic Data-Driven One-to-One Alignment for Any-to-Any Voice Conversion
by: Tu, Huu Tuong, et al.
Published: (2025)
Stream-based Active Learning for Anomalous Sound Detection in Machine Condition Monitoring
by: Ho, Tuan Vu, et al.
Published: (2024)
Assessing the Impact of Speaker Identity in Speech Spoofing Detection
by: Dao, Anh-Tuan, et al.
Published: (2026)
Speechless: Speech Instruction Training Without Speech for Low Resource Languages
by: Dao, Alan, et al.
Published: (2025)
Deepfake Audio Detection Using Spectrogram-based Feature and Ensemble of Deep Learning Models
by: Pham, Lam, et al.
Published: (2024)
Frame-level Temporal Difference Learning for Partial Deepfake Speech Detection
by: Li, Menglu, et al.
Published: (2025)
Multi-Task Transformer for Explainable Speech Deepfake Detection via Formant Modeling
by: Negroni, Viola, et al.
Published: (2026)
Xi+: Uncertainty Supervision for Robust Speaker Embedding
by: Li, Junjie, et al.
Published: (2025)
Real-time Speech Summarization for Medical Conversations
by: Le-Duc, Khai, et al.
Published: (2024)
Fake Speech Wild: Detecting Deepfake Speech on Social Media Platform
by: Xie, Yuankun, et al.
Published: (2025)
MultiMed: Multilingual Medical Speech Recognition via Attention Encoder Decoder
by: Le-Duc, Khai, et al.
Published: (2024)
Towards Scalable AASIST: Refining Graph Attention for Speech Deepfake Detection
by: Viakhirev, Ivan, et al.
Published: (2025)
Emphasized Non-Target Speaker Knowledge in Knowledge Distillation for Automatic Speaker Verification
by: Truong, Duc-Tuan, et al.
Published: (2023)
SLIM: Style-Linguistics Mismatch Model for Generalized Audio Deepfake Detection
by: Zhu, Yi, et al.
Published: (2024)