:: Library Catalog

Bibliographic Details
Main Authors:	Kim, June-Woo; Lee, Sanghoon; Toikkanen, Miika; Hwang, Daehwan; Kim, Kyunghoon
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence; Sound
Online Access:	https://arxiv.org/abs/2505.06271

Similar Items

Improving Respiratory Sound Classification with Architecture-Agnostic Knowledge Distillation from Ensembles
by: Toikkanen, Miika, et al.
Published: (2025)

Meta-Ensemble Learning with Diverse Data Splits for Improved Respiratory Sound Classification
by: Kim, June-Woo, et al.
Published: (2026)

Mitigating Stethoscope-Induced Shortcuts in Respiratory Sound Classification under Federated Domain Generalization with Causality-Inspired Interventions
by: Koo, Heejoon, et al.
Published: (2026)

RepAugment: Input-Agnostic Representation-Level Augmentation for Respiratory Sound Classification
by: Kim, June-Woo, et al.
Published: (2024)

BTS: Bridging Text and Sound Modalities for Metadata-Aided Respiratory Sound Classification
by: Kim, June-Woo, et al.
Published: (2024)

Noise-Agnostic Multitask Whisper Training for Reducing False Alarm Errors in Call-for-Help Detection
by: Ryu, Myeonghoon, et al.
Published: (2025)

Explainable Multi-Modal Deep Learning for Automatic Detection of Lung Diseases from Respiratory Audio Signals
by: Saky, S M Asiful Islam, et al.
Published: (2025)

AFEN: Respiratory Disease Classification using Ensemble Learning
by: Nadkarni, Rahul, et al.
Published: (2024)

Understanding Frechet Speech Distance for Synthetic Speech Quality Evaluation
by: Kim, June-Woo, et al.
Published: (2026)

Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification
by: Bae, Sangmin, et al.
Published: (2023)

Hookpad Aria: A Copilot for Songwriters
by: Donahue, Chris, et al.
Published: (2025)

AuditoryBench++: Can Language Models Understand Auditory Knowledge without Hearing?
by: Ok, Hyunjong, et al.
Published: (2025)

Alternating Approach-Putt Models for Multi-Stage Speech Enhancement
by: Jeong, Iksoon, et al.
Published: (2025)

Sample-Efficient Diffusion for Text-To-Speech Synthesis
by: Lovelace, Justin, et al.
Published: (2024)

Robust TTS Training via Self-Purifying Flow Matching for the WildSpoof 2026 TTS Track
by: Yi, June Young, et al.
Published: (2025)

Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking
by: Zhang, Yuwei, et al.
Published: (2024)

PC-MCL: Patient-Consistent Multi-Cycle Learning with multi-label bias correction for respiratory sound classification
by: Jeong, Seung Gyu, et al.
Published: (2026)

Imagined Speech State Classification for Robust Brain-Computer Interface
by: Ko, Byung-Kwan, et al.
Published: (2024)

SAND Challenge: Four Approaches for Dysartria Severity Classification
by: Deshpande, Gauri, et al.
Published: (2025)

RespLLM: Unifying Audio and Text with Multimodal LLMs for Generalized Respiratory Health Prediction
by: Zhang, Yuwei, et al.
Published: (2024)

Let There Be Sound: Reconstructing High Quality Speech from Silent Videos
by: Kim, Ji-Hoon, et al.
Published: (2023)

SafeDPO: A Simple Approach to Direct Preference Optimization with Enhanced Safety
by: Kim, Geon-Hyeong, et al.
Published: (2025)

Live Music Diffusion Models: Efficient Fine-Tuning and Post-Training of Interactive Diffusion Music Generators
by: Novack, Zachary, et al.
Published: (2026)

Preference-Based Learning in Audio Applications: A Systematic Analysis
by: Broukhim, Aaron, et al.
Published: (2025)

A Human-Inspired Decoupled Architecture for Efficient Audio Representation Learning
by: Kawano, Harunori, et al.
Published: (2026)

Geometry-Aware Optimization for Respiratory Sound Classification: Enhancing Sensitivity with SAM-Optimized Audio Spectrogram Transformers
by: Işık, Atakan, et al.
Published: (2025)

D3RM: A Discrete Denoising Diffusion Refinement Model for Piano Transcription
by: Kim, Hounsu, et al.
Published: (2025)

Music Plagiarism Detection: Problem Formulation and a Segment-based Solution
by: Go, Seonghyeon, et al.
Published: (2026)

AUDRON: A Deep Learning Framework with Fused Acoustic Signatures for Drone Type Recognition
by: Chatterjee, Rajdeep, et al.
Published: (2025)

Innovative Speech-Based Deep Learning Approaches for Parkinson's Disease Classification: A Systematic Review
by: van Gelderen, Lisanne, et al.
Published: (2024)

DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors
by: Lee, Keon, et al.
Published: (2024)

Logit Distillation on Manifolds: Mapping by Learning
by: Yang, Yiru, et al.
Published: (2026)

Whisfusion: Parallel ASR Decoding via a Diffusion Transformer
by: Kwon, Taeyoun, et al.
Published: (2025)

Myna: Masking-Based Contrastive Learning of Musical Representations
by: Yonay, Ori, et al.
Published: (2025)

AudioMosaic: Contrastive Masked Audio Representation Learning
by: Huang, Hanxun, et al.
Published: (2026)

Multi-Task Learning for Lung sound & Lung disease classification
by: K V, Suma, et al.
Published: (2024)

Phase-Aware Deep Learning with Complex-Valued CNNs for Audio Signal Applications
by: Agrawal, Naman
Published: (2025)

Towards Human-in-the-Loop Onset Detection: A Transfer Learning Approach for Maracatu
by: Pinto, António Sá
Published: (2025)

Boosting ASR Robustness via Test-Time Reinforcement Learning with Audio-Text Semantic Rewards
by: Fang, Linghan, et al.
Published: (2026)

QAMRO: Quality-aware Adaptive Margin Ranking Optimization for Human-aligned Assessment of Audio Generation Systems
by: Wang, Chien-Chun, et al.
Published: (2025)