:: Library Catalog

Bibliographic Details
Main Author:	Versace, Plein
Format:	Preprint
Published:	2025
Subjects:	Sound; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2511.18384

Similar Items

Scaling Implicit Fields via Hypernetwork-Driven Multiscale Coordinate Transformations
by: Versace, Plein
Published: (2025)

Machine Anomalous Sound Detection Using Spectral-temporal Modulation Representations Derived from Machine-specific Filterbanks
by: Li, Kai, et al.
Published: (2024)

CodecFlow: Efficient Bandwidth Extension via Conditional Flow Matching in Neural Codec Latent Space
by: Zhang, Bowen, et al.
Published: (2026)

SCRAPS: Speech Contrastive Representations of Acoustic and Phonetic Spaces
by: Vallés-Pérez, Ivan, et al.
Published: (2023)

SONAR: Spectral-Contrastive Audio Residuals for Generalizable Deepfake Detection
by: Hidekel, Ido Nitzan, et al.
Published: (2025)

Full-Frequency Temporal Patching and Structured Masking for Enhanced Audio Classification
by: Makineni, Aditya, et al.
Published: (2025)

Audio Mamba: Bidirectional State Space Model for Audio Representation Learning
by: Erol, Mehmet Hamza, et al.
Published: (2024)

Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations
by: Yadav, Sarthak, et al.
Published: (2024)

TFGA-Net: Temporal-Frequency Graph Attention Network for Brain-Controlled Speaker Extraction
by: Si, Youhao, et al.
Published: (2025)

Sub-Band Spectral Matching with Localized Score Aggregation for Robust Anomalous Sound Detection
by: Saengthong, Phurich, et al.
Published: (2026)

Modeling Music as a Time-Frequency Image: A 2D Tokenizer for Music Generation
by: Cheng, Yuqing, et al.
Published: (2026)

SPEAR: Receiver-to-Receiver Acoustic Neural Warping Field
by: He, Yuhang, et al.
Published: (2024)

Interpretable All-Type Audio Deepfake Detection with Audio LLMs via Frequency-Time Reinforcement Learning
by: Xie, Yuankun, et al.
Published: (2026)

Infant Cry Detection Using Causal Temporal Representation
by: Fu, Minghao, et al.
Published: (2025)

Perceptually Aligning Representations of Music via Noise-Augmented Autoencoders
by: Bjare, Mathias Rose, et al.
Published: (2025)

Deepfake Audio Detection Using Self-supervised Fusion Representations
by: Zaman, Khalid, et al.
Published: (2026)

Masked Latent Prediction and Classification for Self-Supervised Audio Representation Learning
by: Quelennec, Aurian, et al.
Published: (2025)

Rethinking Leveraging Pre-Trained Multi-Layer Representations for Speaker Verification
by: Kim, Jin Sob, et al.
Published: (2025)

State Space Models for Bioacoustics: A Comparative Evaluation with Transformers
by: Tang, Chengyu, et al.
Published: (2025)

A New Dataset, Notation Software, and Representation for Computational Schenkerian Analysis
by: Ni-Hahn, Stephen, et al.
Published: (2024)

Training-Efficient Text-to-Music Generation with State-Space Modeling
by: Lee, Wei-Jaw, et al.
Published: (2026)

MATPAC++: Enhanced Masked Latent Prediction for Self-Supervised Audio Representation Learning
by: Quelennec, Aurian, et al.
Published: (2025)

Spectral Masking and Interpolation Attack (SMIA): A Black-box Adversarial Attack against Voice Authentication and Anti-Spoofing Systems
by: Kamel, Kamel, et al.
Published: (2025)

Same Words, Different Judgments: How Preferences Vary Across Modalities
by: Broukhim, Aaron, et al.
Published: (2026)

Quantize More, Lose Less: Autoregressive Generation from Residually Quantized Speech Representations
by: Han, Yichen, et al.
Published: (2025)

UniWhisper: Efficient Continual Multi-task Training for Robust Universal Audio Representation
by: Chen, Yuxuan, et al.
Published: (2026)

Cross-Cultural Bias in Mel-Scale Representations: Evidence and Alternatives from Speech and Music
by: Chauhan, Shivam, et al.
Published: (2026)

Composer Vector: Style-steering Symbolic Music Generation in a Latent Space
by: Jiang, Xunyi, et al.
Published: (2026)

Universal Speech Token Learning via Low-Bitrate Neural Codec and Pretrained Representations
by: Jiang, Xue, et al.
Published: (2025)

Improving Anomalous Sound Detection with Attribute-aware Representation from Domain-adaptive Pre-training
by: Fang, Xin, et al.
Published: (2025)

Probabilistic Fusion and Calibration of Neural Speaker Diarization Models
by: Alvarez-Trejos, Juan Ignacio, et al.
Published: (2025)

Toward Complex-Valued Neural Networks for Waveform Generation
by: Oh, Hyung-Seok, et al.
Published: (2026)

Latent-Mark: An Audio Watermark Robust to Neural Resynthesis
by: Chen, Yen-Shan, et al.
Published: (2026)

Evaluating Neural Networks Architectures for Spring Reverb Modelling
by: Papaleo, Francesco, et al.
Published: (2024)

Unify Variables in Neural Scaling Laws for General Audio Representations via Embedding Effective Rank
by: Deng, Xuyao, et al.
Published: (2025)

AnalysisGNN: Unified Music Analysis with Graph Neural Networks
by: Karystinaios, Emmanouil, et al.
Published: (2025)

Robust Neural Audio Fingerprinting using Music Foundation Models
by: Singh, Shubhr, et al.
Published: (2025)

Neural personal sound zones with flexible bright zone control
by: Zhu, Wenye, et al.
Published: (2025)

The Equalizer: Introducing Shape-Gain Decomposition in Neural Audio Codecs
by: Sadok, Samir, et al.
Published: (2026)

Self Voice Conversion as an Attack against Neural Audio Watermarking
by: Özer, Yigitcan, et al.
Published: (2026)