| Main Author: | Ogg, Mattson |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.02366 |
Similar Items
Self-Supervised Speech Quality Assessment (S3QA): Leveraging Speech Foundation Models for a Scalable Speech Quality Metric
by: Ogg, Mattson, et al.
Published: (2025)
VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature
by: Du, Chenpeng, et al.
Published: (2022)
Data Selection Effects on Self-Supervised Learning of Audio Representations for French Audiovisual Broadcasts
by: Pelloin, Valentin, et al.
Published: (2026)
GigaAM: Efficient Self-Supervised Learner for Speech Recognition
by: Kutsakov, Aleksandr, et al.
Published: (2025)
Self-Supervised Multi-View Learning for Disentangled Music Audio Representations
by: Wilkins, Julia, et al.
Published: (2024)
Domain-Incremental Learning for Audio Classification
by: Mulimani, Manjunath, et al.
Published: (2024)
Exploring Self-Supervised Audio Models for Generalized Anomalous Sound Detection
by: Han, Bing, et al.
Published: (2025)
On the Transferability of Large-Scale Self-Supervision to Few-Shot Audio Classification
by: Heggan, Calum, et al.
Published: (2024)
SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model
by: Shams, Siavash, et al.
Published: (2024)
Pitch Contour Exploration Across Audio Domains: A Vision-Based Transfer Learning Approach
by: Abeßer, Jakob, et al.
Published: (2025)
Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation
by: Wu, Shih-Lun, et al.
Published: (2023)
DASS: Distilled Audio State Space Models Are Stronger and More Duration-Scalable Learners
by: Bhati, Saurabhchand, et al.
Published: (2024)
UniAudio 1.5: Large Language Model-driven Audio Codec is A Few-shot Audio Task Learner
by: Yang, Dongchao, et al.
Published: (2024)
Transfer Learning for Paediatric Sleep Apnoea Detection Using Physiology-Guided Acoustic Models
by: Niu, Chaoyue, et al.
Published: (2025)
Leveraging Self-supervised Audio Representations for Data-Efficient Acoustic Scene Classification
by: Cai, Yiqiang, et al.
Published: (2024)
Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers
by: Cappellazzo, Umberto, et al.
Published: (2023)
Leveraging Self-Supervised Audio-Visual Pretrained Models to Improve Vocoded Speech Intelligibility in Cochlear Implant Simulation
by: Lai, Richard Lee, et al.
Published: (2023)
Audio Deepfake Detection with Self-Supervised WavLM and Multi-Fusion Attentive Classifier
by: Guo, Yinlin, et al.
Published: (2023)
DeePAQ: A Perceptual Audio Quality Metric Based On Foundational Models and Weakly Supervised Learning
by: Jiang, Guanxin, et al.
Published: (2025)
Self-Supervised Learning of Spatial Acoustic Representation with Cross-Channel Signal Reconstruction and Multi-Channel Conformer
by: Yang, Bing, et al.
Published: (2023)
DSCLAP: Domain-Specific Contrastive Language-Audio Pre-Training
by: Liu, Shengqiang, et al.
Published: (2024)
Acoustic Non-Stationarity Objective Assessment with Hard Label Criteria for Supervised Learning Models
by: Zucatelli, Guilherme, et al.
Published: (2025)
AudioNet: Supervised Deep Hashing for Retrieval of Similar Audio Events
by: Dutta, Sagar, et al.
Published: (2025)
SAGA-SR: Semantically and Acoustically Guided Audio Super-Resolution
by: Im, Jaekwon, et al.
Published: (2025)
Acoustic Teleportation via Disentangled Neural Audio Codec Representations
by: Grundhuber, Philipp, et al.
Published: (2025)
Optimizing Domain-Adaptive Self-Supervised Learning for Clinical Voice-Based Disease Classification
by: Liu, Weixin, et al.
Published: (2026)
Analytic Study of Text-Free Speech Synthesis for Raw Audio using a Self-Supervised Learning Model
by: Park, Joonyong, et al.
Published: (2024)
MiMo-Audio: Audio Language Models are Few-Shot Learners
by: Core Team, et al.
Published: (2025)
Sub-band Domain Multi-Hypothesis Acoustic Echo Canceler Based Acoustic Scene Analysis
by: Southwell, Benjamin J, et al.
Published: (2025)
Prototype based Masked Audio Model for Self-Supervised Learning of Sound Event Detection
by: Cai, Pengfei, et al.
Published: (2024)
Domain Adaptation for Contrastive Audio-Language Models
by: Deshmukh, Soham, et al.
Published: (2024)
Bias and Fairness in Self-Supervised Acoustic Representations for Cognitive Impairment Detection
by: Gulzar, Kashaf, et al.
Published: (2026)
Can you Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features?
by: Aldeneh, Zakaria, et al.
Published: (2024)
Past, Present, and Future of Spatial Audio and Room Acoustics
by: Koyama, Shoichi, et al.
Published: (2025)
Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations
by: Yadav, Sarthak, et al.
Published: (2024)
Enhancing Audio-Language Models through Self-Supervised Post-Training with Text-Audio Pairs
by: Sinha, Anshuman, et al.
Published: (2024)
Leveraging Audio-Visual Data to Reduce the Multilingual Gap in Self-Supervised Speech Models
by: Blandón, María Andrea Cruz, et al.
Published: (2025)
SONAR: Self-Distilled Continual Pre-training for Domain Adaptive Audio Representation
by: Zhang, Yizhou, et al.
Published: (2025)
Audio Classification of Low Feature Spectrograms Utilizing Convolutional Neural Networks
by: Elias, Noel
Published: (2024)
MATS: An Audio Language Model under Text-only Supervision
by: Wang, Wen, et al.
Published: (2025)