| Main authors: | Myrgyyassov, Alisher; Wang, Bruce Xiao; Sun, Yu; Huang, Shuming; Song, Zhen; Wong, Min Ney; Zheng, Yongping |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online access: | https://arxiv.org/abs/2603.03350 |
Similar items
Anonymising Elderly and Pathological Speech: Voice Conversion Using DDSP and Query-by-Example
by: Ghosh, Suhita, et al.
Published: (2024)
Temporal Feature Learning in Weakly Labelled Bioacoustic Cetacean Datasets via a Variational Autoencoder and Temporal Convolutional Network: An Interdisciplinary Approach
by: Fonollosa, Laia Garrobé, et al.
Published: (2024)
Computational bioacoustics with deep learning: a review and roadmap
by: Stowell, Dan
Published: (2021)
Adaptive Representations of Sound for Automatic Insect Recognition
by: Faiß, Marius, et al.
Published: (2023)
Learning to detect an animal sound from five examples
by: Nolasco, Inês, et al.
Published: (2023)
Fish Tracking, Counting, and Behaviour Analysis in Digital Aquaculture: A Comprehensive Survey
by: Cui, Meng, et al.
Published: (2024)
Learning to rumble: Automated elephant call classification, detection and endpointing using deep architectures
by: Geldenhuys, Christiaan M., et al.
Published: (2024)
Cochlear Wave Propagation and Dynamics in the Human Base and Apex: Model-Based Estimates from Noninvasive Measurements
by: Alkhairy, Samiya A
Published: (2024)
Rene: A Pre-trained Multi-modal Architecture for Auscultation of Respiratory Diseases
by: Zhang, Pengfei, et al.
Published: (2024)
A Classification Benchmark for Artificial Intelligence Detection of Laryngeal Cancer from Patient Voice
by: Paterson, Mary, et al.
Published: (2024)
Which Augmentation Should I Use? An Empirical Investigation of Augmentations for Self-Supervised Phonocardiogram Representation Learning
by: Ballas, Aristotelis, et al.
Published: (2023)
Automatic detection of Mild Cognitive Impairment using high-dimensional acoustic features in spontaneous speech
by: Zhang, Cong, et al.
Published: (2024)
Screening method for early dementia using sound objects as voice biomarkers
by: Pluta, Adam, et al.
Published: (2024)
Foundation Models for Bioacoustics -- a Comparative Review
by: Schwinger, Raphael, et al.
Published: (2025)
From Birdsong to Rumbles: Classifying Elephant Calls with Out-of-Species Embeddings
by: Geldenhuys, Christiaan M., et al.
Published: (2026)
Prosody of speech production in latent post-stroke aphasia
by: Zhang, Cong, et al.
Published: (2024)
All Thresholds Barred: Direct Estimation of Call Density in Bioacoustic Data
by: Navine, Amanda K., et al.
Published: (2024)
Multi Modal Information Fusion of Acoustic and Linguistic Data for Decoding Dairy Cow Vocalizations in Animal Welfare Assessment
by: Jobarteh, Bubacarr, et al.
Published: (2024)
animal2vec and MeerKAT: A self-supervised transformer for rare-event raw audio input and a large-scale reference dataset for bioacoustics
by: Schäfer-Zimmermann, Julian C., et al.
Published: (2024)
Chunked Attention-based Encoder-Decoder Model for Streaming Speech Recognition
by: Zeineldeen, Mohammad, et al.
Published: (2023)
Sequence-Level Unsupervised Training in Speech Recognition: A Theoretical Study
by: Yang, Zijian, et al.
Published: (2026)
Right Label Context in End-to-End Training of Time-Synchronous ASR Models
by: Raissi, Tina, et al.
Published: (2025)
Atrial Fibrillation Detection System via Acoustic Sensing for Mobile Phones
by: Liu, Xuanyu, et al.
Published: (2024)
WhaleVAD-BPN: Improving Baleen Whale Call Detection with Boundary Proposal Networks and Post-processing Optimisation
by: Geldenhuys, Christiaan M., et al.
Published: (2025)
Vision-Integrated High-Quality Neural Speech Coding
by: Guo, Yao, et al.
Published: (2025)
Conformer-based Ultrasound-to-Speech Conversion
by: Ibrahimov, Ibrahim, et al.
Published: (2025)
SpeechEditBench: A Bilingual Multi-Attribute Benchmark for Instruction-Guided Speech Editing
by: Zhang, Hanlin, et al.
Published: (2026)
Multi-Stage Speech Bandwidth Extension with Flexible Sampling Rate Control
by: Lu, Ye-Xin, et al.
Published: (2024)
Investigating the Effect of Label Topology and Training Criterion on ASR Performance and Alignment Quality
by: Raissi, Tina, et al.
Published: (2024)
Low-Latency Neural Speech Phase Prediction based on Parallel Estimation Architecture and Anti-Wrapping Losses for Speech Generation Tasks
by: Ai, Yang, et al.
Published: (2024)
SuperCodec: A Neural Speech Codec with Selective Back-Projection Network
by: Zheng, Youqiang, et al.
Published: (2024)
AntiDeepFake: AI for Deep Fake Speech Recognition
by: Togootogtokh, Enkhtogtokh, et al.
Published: (2024)
All Neural Low-latency Directional Speech Extraction
by: Pandey, Ashutosh, et al.
Published: (2024)
Investigating Neural Audio Codecs for Speech Language Model-Based Speech Generation
by: Li, Jiaqi, et al.
Published: (2024)
On the Importance of Neural Wiener Filter for Resource Efficient Multichannel Speech Enhancement
by: Hsieh, Tsun-An, et al.
Published: (2024)
Universal Preference-Score-based Pairwise Speech Quality Assessment
by: Shi, Yu-Fei, et al.
Published: (2025)
Voice Conversion for Likability Control via Automated Rating of Speech Synthesis Corpora
by: Suda, Hitoshi, et al.
Published: (2025)
Regularizing Learnable Feature Extraction for Automatic Speech Recognition
by: Vieting, Peter, et al.
Published: (2025)
Deep Speech Synthesis from Multimodal Articulatory Representations
by: Wu, Peter, et al.
Published: (2024)
ToneUnit: A Speech Discretization Approach for Tonal Language Speech Synthesis
by: Tao, Dehua, et al.
Published: (2024)