:: Library Catalog

Bibliographic Details
Main Authors:	Paradise, Orr; Muralikrishnan, Pranav; Chen, Liangyuan; García, Hugo Flores; Pardo, Bryan; Diamant, Roee; Gruber, David F.; Gero, Shane; Goldwasser, Shafi
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Sound
Online Access:	https://arxiv.org/abs/2512.02206

Similar Items

Automatic Detection and Annotation of Sperm Whale Codas
by: Gubnitsky, Guy, et al.
Published: (2024)

Detecting the presence of sperm whales echolocation clicks in noisy environments
by: Gubnitsky, Guy, et al.
Published: (2023)

Models That Prove Their Own Correctness
by: Amit, Noga, et al.
Published: (2024)

Unsupervised Translation of Emergent Communication
by: Levy, Ido, et al.
Published: (2025)

Automatic detection and annotation of eastern Caribbean sperm whale codas.
by: Gubnitsky, Guy, et al.
Published: (2025)

ShipEcho -- An Interactive Tool for Global Mapping of Underwater Radiated Noise from Vessels
by: Shipton, Mark, et al.
Published: (2026)

Review of Cetacean's click detection algorithms
by: Gracic, Mak, et al.
Published: (2024)

Investigating the Development of Task-Oriented Communication in Vision-Language Models
by: Carmeli, Boaz, et al.
Published: (2026)

Exploring Musical Roots: Applying Audio Embeddings to Empower Influence Attribution for a Generative Music Model
by: Barnett, Julia, et al.
Published: (2024)

Multiple Mobile Target Detection and Tracking in Active Sonar Array Using a Track-Before-Detect Approach
by: Abu, Avi, et al.
Published: (2024)

Automated Detection of Dolphin Whistles with Convolutional Networks and Transfer Learning
by: Korkmaz, Burla Nur, et al.
Published: (2022)

Approaching an unknown communication system by latent space exploration and causal inference
by: Beguš, Gašper, et al.
Published: (2023)

Mix2Morph: Learning Sound Morphing from Noisy Mixes
by: Chu, Annie, et al.
Published: (2026)

Sketch2Sound: Controllable Audio Generation via Time-Varying Signals and Sonic Imitations
by: García, Hugo Flores, et al.
Published: (2024)

On Non-interactive Evaluation of Animal Communication Translators
by: Paradise, Orr, et al.
Published: (2025)

Vowel- and Diphthong-Like Spectral Patterns in Sperm Whale Codas.
by: Beguš, Gašper, et al.
Published: (2025)

The Rhythm In Anything: Audio-Prompted Drums Generation with Masked Language Modeling
by: O'Reilly, Patrick, et al.
Published: (2025)

Towards the Synthesis of Non-speech Vocalizations
by: Hoq, Enjamamul, et al.
Published: (2024)

Code Drift: Towards Idempotent Neural Audio Codecs
by: O'Reilly, Patrick, et al.
Published: (2024)

Learning Randomized Reductions
by: Erata, Ferhat, et al.
Published: (2024)

High-Fidelity Neural Phonetic Posteriorgrams
by: Churchwell, Cameron, et al.
Published: (2024)

Mel-RoFormer for Vocal Separation and Vocal Melody Transcription
by: Wang, Ju-Chiang, et al.
Published: (2024)

MoVE: Translating Laughter and Tears via Mixture of Vocalization Experts in Speech-to-Speech Translation
by: Chen, Szu-Chi, et al.
Published: (2026)

Breaking Audio Large Language Models by Attacking Only the Encoder: A Universal Targeted Latent-Space Audio Attack
by: Ziv, Roee, et al.
Published: (2025)

VocalParse: Towards Unified and Scalable Singing Voice Transcription with Large Audio Language Models
by: Chen, Yukun, et al.
Published: (2026)

Cross-domain Neural Pitch and Periodicity Estimation
by: Morrison, Max, et al.
Published: (2023)

Fine-Grained and Interpretable Neural Speech Editing
by: Morrison, Max, et al.
Published: (2024)

Efficient Vocal Source Separation Through Windowed Sink Attention
by: Benetatos, Christodoulos, et al.
Published: (2025)

UniVocal: Unified Speech-Singing Code-Switching Synthesis
by: Shi, Yufei, et al.
Published: (2026)

NVBench: A Benchmark for Speech Synthesis with Non-Verbal Vocalizations
by: Xue, Liumeng, et al.
Published: (2026)

Smule Renaissance Small: Efficient General-Purpose Vocal Restoration
by: Zang, Yongyi, et al.
Published: (2025)

Affectron: Emotional Speech Synthesis with Affective and Contextually Aligned Nonverbal Vocalizations
by: Cho, Deok-Hyeon, et al.
Published: (2026)

LaDA-Band: Language Diffusion Models for Vocal-to-Accompaniment Generation
by: Wang, Qi, et al.
Published: (2026)

Do Joint Language-Audio Embeddings Encode Perceptual Timbre Semantics?
by: Deng, Qixin, et al.
Published: (2025)

A Dataset for Automatic Vocal Mode Classification
by: Hinrichs, Reemt, et al.
Published: (2026)

Deep Audio Watermarks are Shallow: Limitations of Post-Hoc Watermarking Techniques for Speech
by: O'Reilly, Patrick, et al.
Published: (2025)

Text2FX: Harnessing CLAP Embeddings for Text-Guided Audio Effects
by: Chu, Annie, et al.
Published: (2024)

CVSM: Contrastive Vocal Similarity Modeling
by: Garoufis, Christos, et al.
Published: (2025)

VocalAgent: Large Language Models for Vocal Health Diagnostics with Safety-Aware Evaluation
by: Kim, Yubin, et al.
Published: (2025)

Ethics Statements in AI Music Papers: The Effective and the Ineffective
by: Barnett, Julia, et al.
Published: (2025)