| Main Authors: | Ai, Zhiqi; Chen, Zhiyong; Xu, Shugong |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online access: | https://arxiv.org/abs/2406.07310 |
Similar works
Effective User-defined Keyword Spotting with Dual-stage Matching, Multi-modal Enrollment, and Continual Adaptation
by: Ai, Zhiqi, et al.
Published: (2026)
PCOV-KWS: Multi-task Learning for Personalized Customizable Open Vocabulary Keyword Spotting
by: Pan, Jianan, et al.
Published: (2026)
ProKWS: Personalized Keyword Spotting via Collaborative Learning of Phonemes and Prosody
by: Pan, Jianan, et al.
Published: (2026)
NTC-KWS: Noise-aware CTC for Robust Keyword Spotting
by: Xi, Yu, et al.
Published: (2024)
Enhancing Open-Set Speaker Identification through Rapid Tuning with Speaker Reciprocal Points and Negative Sample
by: Chen, Zhiyong, et al.
Published: (2024)
StyleFusion TTS: Multimodal Style-control and Enhanced Feature Fusion for Zero-shot Text-to-speech Synthesis
by: Chen, Zhiyong, et al.
Published: (2024)
MFA-KWS: Effective Keyword Spotting with Multi-head Frame-asynchronous Decoding
by: Xi, Yu, et al.
Published: (2025)
TDT-KWS: Fast And Accurate Keyword Spotting Using Token-and-duration Transducer
by: Xi, Yu, et al.
Published: (2024)
AdaKWS: Towards Robust Keyword Spotting with Test-Time Adaptation
by: Xiao, Yang, et al.
Published: (2025)
EdgeSpot: Efficient and High-Performance Few-Shot Model for Keyword Spotting
by: Buyuksolak, Oguzhan, et al.
Published: (2026)
LLM-Synth4KWS: Scalable Automatic Generation and Synthesis of Confusable Data for Custom Keyword Spotting
by: Zhu, Pai, et al.
Published: (2025)
GraphemeAug: A Systematic Approach to Synthesized Hard Negative Keyword Spotting Examples
by: Zhang, Harry, et al.
Published: (2025)
Enhancing Few-shot Keyword Spotting Performance through Pre-Trained Self-supervised Speech Models
by: Gok, Alican, et al.
Published: (2025)
Language-Universal Speech Attributes Modeling for Zero-Shot Multilingual Spoken Keyword Recognition
by: Yen, Hao, et al.
Published: (2024)
ED-sKWS: Early-Decision Spiking Neural Networks for Rapid and Energy-Efficient Keyword Spotting
by: Song, Zeyang, et al.
Published: (2024)
AnalyticKWS: Towards Exemplar-Free Analytic Class Incremental Learning for Small-footprint Keyword Spotting
by: Xiao, Yang, et al.
Published: (2025)
WCTC-Biasing: Retraining-free Contextual Biasing ASR with Wildcard CTC-based Keyword Spotting and Inter-layer Biasing
by: Nakagome, Yu, et al.
Published: (2025)
Multiple-Instance, Cascaded Classification for Keyword Spotting in Narrow-Band Audio
by: AbdulKader, Ahmad, et al.
Published: (2017)
Query-by-Example Keyword Spotting Using Spectral-Temporal Graph Attentive Pooling and Multi-Task Learning
by: Wang, Zhenyu, et al.
Published: (2024)
Effective Integration of KAN for Keyword Spotting
by: Xu, Anfeng, et al.
Published: (2024)
Assessing the Impact of Anisotropy in Neural Representations of Speech: A Case Study on Keyword Spotting
by: Wisniewski, Guillaume, et al.
Published: (2025)
Keyword Mamba: Spoken Keyword Spotting with State Space Models
by: Ding, Hanyu, et al.
Published: (2025)
Phoneme-Level Contrastive Learning for User-Defined Keyword Spotting with Flexible Enrollment
by: Kewei, Li, et al.
Published: (2024)
A Literature Review of Keyword Spotting Technologies for Urdu
by: Rizvi, Syed Muhammad Aqdas
Published: (2024)
Multichannel Keyword Spotting for Noisy Conditions
by: Saladukha, Dzmitry, et al.
Published: (2025)
MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis
by: Guan, Wenhao, et al.
Published: (2023)
Rapid Language Adaptation for Multilingual E2E Speech Recognition Using Encoder Prompting
by: Kashiwagi, Yosuke, et al.
Published: (2024)
Teaching a Multilingual Large Language Model to Understand Multilingual Speech via Multi-Instructional Training
by: Denisov, Pavel, et al.
Published: (2024)
Romanization Encoding For Multilingual ASR
by: Ding, Wen, et al.
Published: (2024)
Emphasis Rendering for Conversational Text-to-Speech with Multi-modal Multi-scale Context Modeling
by: Liu, Rui, et al.
Published: (2024)
HCAM -- Hierarchical Cross Attention Model for Multi-modal Emotion Recognition
by: Dutta, Soumya, et al.
Published: (2023)
A Language-Agnostic Hierarchical LoRA-MoE Architecture for CTC-based Multilingual ASR
by: Zheng, Yuang, et al.
Published: (2026)
Exploring SSL Discrete Tokens for Multilingual ASR
by: Cui, Mingyu, et al.
Published: (2024)
MultiMed: Multilingual Medical Speech Recognition via Attention Encoder Decoder
by: Le-Duc, Khai, et al.
Published: (2024)
Multi-Teacher Language-Aware Knowledge Distillation for Multilingual Speech Emotion Recognition
by: Bijoy, Mehedi Hasan, et al.
Published: (2025)
M3TCM: Multi-modal Multi-task Context Model for Utterance Classification in Motivational Interviews
by: Hossain, Sayed Muddashir, et al.
Published: (2024)
Streaming Keyword Spotting Boosted by Cross-layer Discrimination Consistency
by: Xi, Yu, et al.
Published: (2024)
Contrastive Learning With Audio Discrimination For Customizable Keyword Spotting In Continuous Speech
by: Xi, Yu, et al.
Published: (2024)
Multilingual DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts
by: Ferraz, Thomas Palmeira, et al.
Published: (2023)
Low-Resourced Speech Recognition for Iu Mien Language via Weakly-Supervised Phoneme-based Multilingual Pre-training
by: Dong, Lukuan, et al.
Published: (2024)