:: Library Catalog

Bibliographic Details
Main Authors:	Huang, Jiawen; Benetos, Emmanouil
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing; Computation and Language; Sound
Online Access:	https://arxiv.org/abs/2406.17618

Similar Items

LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT
Author: Zhuo, Le, et al.
Published: (2023)

Enhancing Lyrics Transcription on Music Mixtures with Consistency Loss
Author: Huang, Jiawen, et al.
Published: (2025)

Classification of Spontaneous and Scripted Speech for Multilingual Audio
Author: Elisha, Shahar, et al.
Published: (2024)

RUMAA: Repeat-Aware Unified Music Audio Analysis for Score-Performance Alignment, Transcription, and Mistake Detection
Author: Chang, Sungkyun, et al.
Published: (2025)

Joint Training And Decoding for Multilingual End-to-End Simultaneous Speech Translation
Author: Huang, Wuwei, et al.
Published: (2025)

Acoustically Precise Hesitation Tagging Is Essential for End-to-End Verbatim Transcription Systems
Author: Lin, Jhen-Ke, et al.
Published: (2025)

Code-Switching in End-to-End Automatic Speech Recognition: A Systematic Literature Review
Author: Agro, Maha Tufail, et al.
Published: (2025)

A Data-Driven Analysis of Robust Automatic Piano Transcription
Author: Edwards, Drew, et al.
Published: (2024)

Exploring Spoken Language Identification Strategies for Automatic Transcription of Multilingual Broadcast and Institutional Speech
Author: Valente, Martina, et al.
Published: (2024)

GLM-4-Voice: Towards Intelligent and Human-Like End-to-End Spoken Chatbot
Author: Zeng, Aohan, et al.
Published: (2024)

Lyrics Transcription for Humans: A Readability-Aware Benchmark
Author: Cífka, Ondřej, et al.
Published: (2024)

An End-to-End Speech Summarization Using Large Language Model
Author: Shang, Hengchao, et al.
Published: (2024)

Contextualized End-to-end Automatic Speech Recognition with Intermediate Biasing Loss
Author: Shakeel, Muhammad, et al.
Published: (2024)

Representation Purification for End-to-End Speech Translation
Author: Zhang, Chengwei, et al.
Published: (2024)

Exploiting Music Source Separation for Automatic Lyrics Transcription with Whisper
Author: Syed, Jaza, et al.
Published: (2025)

Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model
Author: Huang, Ailin, et al.
Published: (2025)

AdaST: Dynamically Adapting Encoder States in the Decoder for End-to-End Speech-to-Text Translation
Author: Huang, Wuwei, et al.
Published: (2025)

End-to-End Speech-to-Text Translation: A Survey
Author: Sethiya, Nivedita, et al.
Published: (2023)

REFFLY: Melody-Constrained Lyrics Editing Model
Author: Zhao, Songyan, et al.
Published: (2024)

Beyond Binary: Multiclass Paraphasia Detection with Generative Pretrained Transformers and End-to-End Models
Author: Perez, Matthew, et al.
Published: (2024)

VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models
Author: Cui, Wenqian, et al.
Published: (2025)

Survey of End-to-End Multi-Speaker Automatic Speech Recognition for Monaural Audio
Author: He, Xinlu, et al.
Published: (2025)

Towards End-to-End Training of Automatic Speech Recognition for Nigerian Pidgin
Author: Rufai, Amina Mardiyyah, et al.
Published: (2020)

An investigation of phrase break prediction in an End-to-End TTS system
Author: Vadapalli, Anandaswarup
Published: (2023)

Fotheidil: an Automatic Transcription System for the Irish Language
Author: Lonergan, Liam, et al.
Published: (2024)

Baichuan-Audio: A Unified Framework for End-to-End Speech Interaction
Author: Li, Tianpeng, et al.
Published: (2025)

Scaling and Prompting for Improved End-to-End Spoken Grammatical Error Correction
Author: Qian, Mengjie, et al.
Published: (2025)

Alternating Weak Triphone/BPE Alignment Supervision from Hybrid Model Improves End-to-End ASR
Author: Jiang, Jintao, et al.
Published: (2024)

Extreme Encoder Output Frame Rate Reduction: Improving Computational Latencies of Large End-to-End Models
Author: Prabhavalkar, Rohit, et al.
Published: (2024)

Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition
Author: Higuchi, Yosuke, et al.
Published: (2023)

PRoDeliberation: Parallel Robust Deliberation for End-to-End Spoken Language Understanding
Author: Le, Trang, et al.
Published: (2024)

Leveraging Synthetic Audio Data for End-to-End Low-Resource Speech Translation
Author: Moslem, Yasmin
Published: (2024)

Adapting Diarization-Conditioned Whisper for End-to-End Multi-Talker Speech Recognition
Author: Kocour, Martin, et al.
Published: (2025)

Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent
Author: Cheng, Shanbo, et al.
Published: (2024)

Towards an End-to-End Framework for Invasive Brain Signal Decoding with Large Language Models
Author: Feng, Sheng, et al.
Published: (2024)

Streaming Bilingual End-to-End ASR model using Attention over Multiple Softmax
Author: Patil, Aditya, et al.
Published: (2024)

Soft Language Identification for Language-Agnostic Many-to-One End-to-End Speech Translation
Author: Wang, Peidong, et al.
Published: (2024)

Post-decoder Biasing for End-to-End Speech Recognition of Multi-turn Medical Interview
Author: Liu, Heyang, et al.
Published: (2024)

Chain-of-Thought Reasoning in Streaming Full-Duplex End-to-End Spoken Dialogue Systems
Author: Arora, Siddhant, et al.
Published: (2025)

Predictive Speech Recognition and End-of-Utterance Detection Towards Spoken Dialog Systems
Author: Zink, Oswald, et al.
Published: (2024)