| Main authors: | Singh, Shruti; Singh, Muskaan; Kadyan, Virender |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2408.14991 |
Similar items
Cross-lingual Embedding Clustering for Hierarchical Softmax in Low-Resource Multilingual Speech Recognition
by: Yang, Zhengdong, et al.
Published: (2025)
Voice of India: A Large-Scale Benchmark for Real-World Speech Recognition in India
by: Bhogale, Kaushal, et al.
Published: (2026)
Adapting Self-Supervised Speech Representations for Cross-lingual Dysarthria Detection in Parkinson's Disease
by: Hernandez, Abner, et al.
Published: (2026)
What Do Speech Foundation Models Not Learn About Speech?
by: Waheed, Abdul, et al.
Published: (2024)
Re-Parameterization of Lightweight Transformer for On-Device Speech Emotion Recognition
by: Zhang, Zixing, et al.
Published: (2024)
Benchmarking Automatic Speech Recognition for Indian Languages in Agricultural Contexts
by: S, Chandrashekar M, et al.
Published: (2026)
DIVINE: Coordinating Multimodal Disentangled Representations for Oro-Facial Neurological Disorder Assessment
by: Akhtar, Mohd Mujtaba, et al.
Published: (2026)
Automatic Speech Recognition for Hindi
by: Saha, Anish, et al.
Published: (2024)
Leveraging Cross-Attention Transformer and Multi-Feature Fusion for Cross-Linguistic Speech Emotion Recognition
by: Zhao, Ruoyu, et al.
Published: (2025)
Speech Recognition Rescoring with Large Speech-Text Foundation Models
by: Shivakumar, Prashanth Gurunath, et al.
Published: (2024)
Joint Automatic Speech Recognition And Structure Learning For Better Speech Understanding
by: Hu, Jiliang, et al.
Published: (2025)
Inappropriate Pause Detection In Dysarthric Speech Using Large-Scale Speech Recognition
by: Lee, Jeehyun, et al.
Published: (2024)
Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition
by: Cornell, Samuele, et al.
Published: (2024)
Swedish Whispers; Leveraging a Massive Speech Corpus for Swedish Speech Recognition
by: Vesterbacka, Leonora, et al.
Published: (2025)
Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition
by: Wang, Yujin, et al.
Published: (2022)
Dynamic Data Pruning for Automatic Speech Recognition
by: Xiao, Qiao, et al.
Published: (2024)
Contextualized Automatic Speech Recognition with Dynamic Vocabulary
by: Sudo, Yui, et al.
Published: (2024)
Dialectal Coverage And Generalization in Arabic Speech Recognition
by: Djanibekov, Amirbek, et al.
Published: (2024)
On the Contribution of Lexical Features to Speech Emotion Recognition
by: Combei, David
Published: (2025)
Unimodal Aggregation for CTC-based Speech Recognition
by: Fang, Ying, et al.
Published: (2023)
Continual Adaptation for Pacific Indigenous Speech Recognition
by: Xiao, Yang, et al.
Published: (2026)
Improving Speech Emotion Recognition in Under-Resourced Languages via Speech-to-Speech Translation with Bootstrapping Data Selection
by: Lin, Hsi-Che, et al.
Published: (2024)
Enhancing Indonesian Automatic Speech Recognition: Evaluating Multilingual Models with Diverse Speech Variabilities
by: Adila, Aulia, et al.
Published: (2024)
SMILE: Speech Meta In-Context Learning for Low-Resource Language Automatic Speech Recognition
by: Hsu, Ming-Hao, et al.
Published: (2024)
STAB: Speech Tokenizer Assessment Benchmark
by: Vashishth, Shikhar, et al.
Published: (2024)
Are Paralinguistic Representations all that is needed for Speech Emotion Recognition?
by: Phukan, Orchid Chetia, et al.
Published: (2024)
Exploration of Adapter for Noise Robust Automatic Speech Recognition
by: Shi, Hao, et al.
Published: (2024)
Towards Unsupervised Speech Recognition Without Pronunciation Models
by: Ni, Junrui, et al.
Published: (2024)
Sequential Editing for Lifelong Training of Speech Recognition Models
by: Kulshreshtha, Devang, et al.
Published: (2024)
Automatic Speech Recognition for Biomedical Data in Bengali Language
by: Kabir, Shariar, et al.
Published: (2024)
Children's Speech Recognition through Discrete Token Enhancement
by: Sukhadia, Vrunda N., et al.
Published: (2024)
Weight Factorization and Centralization for Continual Learning in Speech Recognition
by: Ugan, Enes Yavuz, et al.
Published: (2025)
Benchmarking Automatic Speech Recognition Models for African Languages
by: Nahabwe, Alvin, et al.
Published: (2025)
Frontend Token Enhancement for Token-Based Speech Recognition
by: Ashihara, Takanori, et al.
Published: (2026)
Retrieval-Augmented Speech Recognition Approach for Domain Challenges
by: Shen, Peng, et al.
Published: (2025)
Speaker-Aware Simulation Improves Conversational Speech Recognition
by: Gedeon, Máté, et al.
Published: (2026)
Exploring Gender Disparities in Automatic Speech Recognition Technology
by: ElGhazaly, Hend, et al.
Published: (2025)
Cross-lingual Text-To-Speech with Flow-based Voice Conversion for Improved Pronunciation
by: Ellinas, Nikolaos, et al.
Published: (2022)
Improving Accented Speech Recognition using Data Augmentation based on Unsupervised Text-to-Speech Synthesis
by: Do, Cong-Thanh, et al.
Published: (2024)
Transformers in Speech Processing: A Survey
by: Latif, Siddique, et al.
Published: (2023)