:: Library Catalog

Bibliographic details
Main Authors:	Ai, Zhiqi, Chen, Zhiyong, Xu, Shugong
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing; Computation and Language; Sound
Online access:	https://arxiv.org/abs/2406.07310

Similar works

Effective User-defined Keyword Spotting with Dual-stage Matching, Multi-modal Enrollment, and Continual Adaptation
by: Ai, Zhiqi, et al.
Published: (2026)

PCOV-KWS: Multi-task Learning for Personalized Customizable Open Vocabulary Keyword Spotting
by: Pan, Jianan, et al.
Published: (2026)

ProKWS: Personalized Keyword Spotting via Collaborative Learning of Phonemes and Prosody
by: Pan, Jianan, et al.
Published: (2026)

NTC-KWS: Noise-aware CTC for Robust Keyword Spotting
by: Xi, Yu, et al.
Published: (2024)

Enhancing Open-Set Speaker Identification through Rapid Tuning with Speaker Reciprocal Points and Negative Sample
by: Chen, Zhiyong, et al.
Published: (2024)

StyleFusion TTS: Multimodal Style-control and Enhanced Feature Fusion for Zero-shot Text-to-speech Synthesis
by: Chen, Zhiyong, et al.
Published: (2024)

MFA-KWS: Effective Keyword Spotting with Multi-head Frame-asynchronous Decoding
by: Xi, Yu, et al.
Published: (2025)

TDT-KWS: Fast And Accurate Keyword Spotting Using Token-and-duration Transducer
by: Xi, Yu, et al.
Published: (2024)

AdaKWS: Towards Robust Keyword Spotting with Test-Time Adaptation
by: Xiao, Yang, et al.
Published: (2025)

EdgeSpot: Efficient and High-Performance Few-Shot Model for Keyword Spotting
by: Buyuksolak, Oguzhan, et al.
Published: (2026)

LLM-Synth4KWS: Scalable Automatic Generation and Synthesis of Confusable Data for Custom Keyword Spotting
by: Zhu, Pai, et al.
Published: (2025)

GraphemeAug: A Systematic Approach to Synthesized Hard Negative Keyword Spotting Examples
by: Zhang, Harry, et al.
Published: (2025)

Enhancing Few-shot Keyword Spotting Performance through Pre-Trained Self-supervised Speech Models
by: Gok, Alican, et al.
Published: (2025)

Language-Universal Speech Attributes Modeling for Zero-Shot Multilingual Spoken Keyword Recognition
by: Yen, Hao, et al.
Published: (2024)

ED-sKWS: Early-Decision Spiking Neural Networks for Rapid, and Energy-Efficient Keyword Spotting
by: Song, Zeyang, et al.
Published: (2024)

AnalyticKWS: Towards Exemplar-Free Analytic Class Incremental Learning for Small-footprint Keyword Spotting
by: Xiao, Yang, et al.
Published: (2025)

WCTC-Biasing: Retraining-free Contextual Biasing ASR with Wildcard CTC-based Keyword Spotting and Inter-layer Biasing
by: Nakagome, Yu, et al.
Published: (2025)

Multiple-Instance, Cascaded Classification for Keyword Spotting in Narrow-Band Audio
by: AbdulKader, Ahmad, et al.
Published: (2017)

Query-by-Example Keyword Spotting Using Spectral-Temporal Graph Attentive Pooling and Multi-Task Learning
by: Wang, Zhenyu, et al.
Published: (2024)

Effective Integration of KAN for Keyword Spotting
by: Xu, Anfeng, et al.
Published: (2024)

Assessing the Impact of Anisotropy in Neural Representations of Speech: A Case Study on Keyword Spotting
by: Wisniewski, Guillaume, et al.
Published: (2025)

Keyword Mamba: Spoken Keyword Spotting with State Space Models
by: Ding, Hanyu, et al.
Published: (2025)

Phoneme-Level Contrastive Learning for User-Defined Keyword Spotting with Flexible Enrollment
by: Kewei, Li, et al.
Published: (2024)

A Literature Review of Keyword Spotting Technologies for Urdu
by: Rizvi, Syed Muhammad Aqdas
Published: (2024)

Multichannel Keyword Spotting for Noisy Conditions
by: Saladukha, Dzmitry, et al.
Published: (2025)

MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis
by: Guan, Wenhao, et al.
Published: (2023)

Rapid Language Adaptation for Multilingual E2E Speech Recognition Using Encoder Prompting
by: Kashiwagi, Yosuke, et al.
Published: (2024)

Teaching a Multilingual Large Language Model to Understand Multilingual Speech via Multi-Instructional Training
by: Denisov, Pavel, et al.
Published: (2024)

Romanization Encoding For Multilingual ASR
by: Ding, Wen, et al.
Published: (2024)

Emphasis Rendering for Conversational Text-to-Speech with Multi-modal Multi-scale Context Modeling
by: Liu, Rui, et al.
Published: (2024)

HCAM -- Hierarchical Cross Attention Model for Multi-modal Emotion Recognition
by: Dutta, Soumya, et al.
Published: (2023)

A Language-Agnostic Hierarchical LoRA-MoE Architecture for CTC-based Multilingual ASR
by: Zheng, Yuang, et al.
Published: (2026)

Exploring SSL Discrete Tokens for Multilingual ASR
by: Cui, Mingyu, et al.
Published: (2024)

MultiMed: Multilingual Medical Speech Recognition via Attention Encoder Decoder
by: Le-Duc, Khai, et al.
Published: (2024)

Multi-Teacher Language-Aware Knowledge Distillation for Multilingual Speech Emotion Recognition
by: Bijoy, Mehedi Hasan, et al.
Published: (2025)

M3TCM: Multi-modal Multi-task Context Model for Utterance Classification in Motivational Interviews
by: Hossain, Sayed Muddashir, et al.
Published: (2024)

Streaming Keyword Spotting Boosted by Cross-layer Discrimination Consistency
by: Xi, Yu, et al.
Published: (2024)

Contrastive Learning With Audio Discrimination For Customizable Keyword Spotting In Continuous Speech
by: Xi, Yu, et al.
Published: (2024)

Multilingual DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts
by: Ferraz, Thomas Palmeira, et al.
Published: (2023)

Low-Resourced Speech Recognition for Iu Mien Language via Weakly-Supervised Phoneme-based Multilingual Pre-training
by: Dong, Lukuan, et al.
Published: (2024)