:: Library Catalog

Bibliographic Details
Main Authors:	Herreilers, Julian; Jacobs, Christiaan; Niesler, Thomas
Format:	Preprint
Published:	2025
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2506.17690

Similar Items

Multilingual acoustic word embeddings for zero-resource languages
by: Jacobs, Christiaan
Published: (2024)

Learning to rumble: Automated elephant call classification, detection and endpointing using deep architectures
by: Geldenhuys, Christiaan M., et al.
Published: (2024)

Multitaper mel-spectrograms for keyword spotting
by: de Souza, Douglas Baptista, et al.
Published: (2024)

From Birdsong to Rumbles: Classifying Elephant Calls with Out-of-Species Embeddings
by: Geldenhuys, Christiaan M., et al.
Published: (2026)

Toward noise-robust whisper keyword spotting on headphones with in-earcup microphone and curriculum learning
by: Yang, Qiaoyu
Published: (2025)

Boosting keyword spotting through on-device learnable user speech characteristics
by: Cioflan, Cristian, et al.
Published: (2024)

WhaleVAD-BPN: Improving Baleen Whale Call Detection with Boundary Proposal Networks and Post-processing Optimisation
by: Geldenhuys, Christiaan M., et al.
Published: (2025)

The taste of IPA: Towards open-vocabulary keyword spotting and forced alignment in any language
by: Zhu, Jian, et al.
Published: (2023)

Open vocabulary keyword spotting through transfer learning from speech synthesis
by: V, Kesavaraj, et al.
Published: (2024)

Guiding the underwater acoustic target recognition with interpretable contrastive learning
by: Xie, Yuan, et al.
Published: (2024)

Improving vision-inspired keyword spotting using dynamic module skipping in streaming conformer encoder
by: Bittar, Alexandre, et al.
Published: (2023)

Hardware-accelerated graph neural networks: an alternative approach for neuromorphic event-based audio classification and keyword spotting on SoC FPGA
by: Jeziorek, Kamil, et al.
Published: (2026)

Visually grounded few-shot word learning in low-resource settings
by: Nortje, Leanne, et al.
Published: (2023)

Predicting positive transfer for improved low-resource speech recognition using acoustic pseudo-tokens
by: San, Nay, et al.
Published: (2024)

Complexity boosted adaptive training for better low resource ASR performance
by: Lu, Hongxuan, et al.
Published: (2024)

Progressive unsupervised domain adaptation for ASR using ensemble models and multi-stage training
by: Ahmad, Rehan, et al.
Published: (2024)

Challenging margin-based speaker embedding extractors by using the variational information bottleneck
by: Stafylakis, Themos, et al.
Published: (2024)

Cough activity detection for automatic tuberculosis screening
by: van Vüren, Joshua Jansen, et al.
Published: (2026)

Automatically assessing oral narratives of Afrikaans and isiXhosa children
by: Louw, Retief, et al.
Published: (2025)

Robust DOA estimation using deep acoustic imaging
by: Roman, Adrian S., et al.
Published: (2024)

Whisper-SV: Adapting Whisper for Low-data-resource Speaker Verification
by: Zhang, Li, et al.
Published: (2024)

From perception to production: how acoustic invariance facilitates articulatory learning in a self-supervised vocal imitation model
by: Lavechin, Marvin, et al.
Published: (2025)

Cross-lingual Data Selection Using Clip-level Acoustic Similarity for Enhancing Low-resource Automatic Speech Recognition
by: Mitsumori, Shunsuke, et al.
Published: (2025)

Complete reconstruction of the tongue contour through acoustic to articulatory inversion using real-time MRI data
by: Azzouz, Sofiane, et al.
Published: (2024)

Investigation of perception inconsistency in speaker embedding for asynchronous voice anonymization
by: Wang, Rui, et al.
Published: (2025)

Speech Recognition for Automatically Assessing Afrikaans and isiXhosa Preschool Oral Narratives
by: Jacobs, Christiaan, et al.
Published: (2025)

ELAICHI: Enhancing Low-resource TTS by Addressing Infrequent and Low-frequency Character Bigrams
by: Anand, Srija, et al.
Published: (2024)

Geodesic interpolation of frame-wise speaker embeddings for the diarization of meeting scenarios
by: Cord-Landwehr, Tobias, et al.
Published: (2024)

Neural acoustic multipole splatting for room impulse response synthesis
by: Baek, Geonwoo, et al.
Published: (2025)

Evaluating pretrained speech embedding systems for dysarthria detection across heterogenous datasets
by: Wihlborg, Lovisa, et al.
Published: (2025)

Tandem spoofing-robust automatic speaker verification based on time-domain embeddings
by: Weizman, Avishai, et al.
Published: (2024)

Improving acoustic drone detection generalization through pretraining and data augmentation
by: Reuter, Paul M., et al.
Published: (2026)

Physics-informed neural network for acoustic resonance analysis in a one-dimensional acoustic tube
by: Yokota, Kazuya, et al.
Published: (2023)

A state-space representation of the boundary integral equation for room acoustic modelling
by: Ali, Randall, et al.
Published: (2026)

Spoken-Term Discovery using Discrete Speech Units
by: van Niekerk, Benjamin, et al.
Published: (2024)

Target word activity detector: An approach to obtain ASR word boundaries without lexicon
by: Sivasankaran, Sunit, et al.
Published: (2024)

Perceptual implications of simplifying geometrical acoustics models for Ambisonics-based binaural reverberation
by: Martin, Vincent, et al.
Published: (2024)

Room acoustics affect communicative success in hybrid meeting spaces: a pilot study
by: Einig, Robert, et al.
Published: (2025)

Post-training for Deepfake Speech Detection
by: Ge, Wanying, et al.
Published: (2025)

QiandaoEar22: A high quality noise dataset for identifying specific ship from multiple underwater acoustic targets using ship-radiated noise
by: Du, Xiaoyang, et al.
Published: (2024)