Bibliographic Details
Main Authors:	Akaishi, Natsuki; Holighaus, Nicki; Yatabe, Kohei
Format:	Preprint
Published:	2026
Subjects:	Audio and Speech Processing; Sound
Online Access:	https://arxiv.org/abs/2602.16421

Similar Items

Phase-Based Signal Representations for Scattering
by: Haider, Daniel, et al.
Published: (2022)

Mel-Spectrogram Inversion via Alternating Direction Method of Multipliers
by: Masuyama, Yoshiki, et al.
Published: (2025)

Subband Splitting: Simple, Efficient and Effective Technique for Solving Block Permutation Problem in Determined Blind Source Separation
by: Matsumoto, Kazuki, et al.
Published: (2024)

Local Equivariance Error-Based Metrics for Evaluating Sampling-Frequency-Independent Property of Neural Network
by: Imamura, Kanami, et al.
Published: (2025)

Algorithms of Sampling-Frequency-Independent Layers for Non-integer Strides
by: Imamura, Kanami, et al.
Published: (2023)

FAST: Fast Audio Spectrogram Transformer
by: Naman, Anugunj, et al.
Published: (2025)

Musical Source Separation of Brazilian Percussion
by: Namballa, Richa, et al.
Published: (2025)

Audio Compression using Periodic Gabor with Biorthogonal Exchange: Implementation Using the Zak Transform
by: Alimi, Roger, et al.
Published: (2025)

Drum-to-Vocal Percussion Sound Conversion and Its Evaluation Methodology
by: Nobukawa, Rinka, et al.
Published: (2025)

Adapter Incremental Continual Learning of Efficient Audio Spectrogram Transformers
by: Selvaraj, Nithish Muthuchamy, et al.
Published: (2023)

ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram
by: Jiang, Xiao-Hang, et al.
Published: (2024)

ASGIR: Audio Spectrogram Transformer Guided Classification And Information Retrieval For Birds
by: Chaudhuri, Yashwardhan, et al.
Published: (2024)

ASM: Audio Spectrogram Mixer
by: Ji, Qingfeng, et al.
Published: (2024)

Evaluating CNN with Stacked Feature Representations and Audio Spectrogram Transformer Models for Sound Classification
by: Dehaghania, Parinaz Binandeh, et al.
Published: (2026)

Improving Audio Spectrogram Transformers for Sound Event Detection Through Multi-Stage Training
by: Schmid, Florian, et al.
Published: (2024)

Speech-Declipping Transformer with Complex Spectrogram and Learnerble Temporal Features
by: Kwon, Younghoo, et al.
Published: (2024)

A Comparative Study on Positional Encoding for Time-frequency Domain Dual-path Transformer-based Source Separation Models
by: Saijo, Kohei, et al.
Published: (2025)

Sound Safeguarding for Acoustic Measurement Using Any Sounds: Tools and Applications
by: Kawahara, Hideki, et al.
Published: (2025)

Convolutional Variational Autoencoders for Spectrogram Compression in Automatic Speech Recognition
by: Iakovenko, Olga, et al.
Published: (2024)

Dual-View Predictive Diffusion: Lightweight Speech Enhancement via Spectrogram-Image Synergy
by: Xue, Ke, et al.
Published: (2026)

Synthesizer Sound Matching Using Audio Spectrogram Transformers
by: Bruford, Fred, et al.
Published: (2024)

Enhancing Spectrogram Realism in Singing Voice Synthesis via Explicit Bandwidth Extension Prior to Vocoder
by: Yang, Runxuan, et al.
Published: (2025)

Ambisonics Binaural Rendering via Masked Magnitude Least Squares
by: Berebi, Or, et al.
Published: (2025)

Learning Magnitude Distribution of Sound Fields via Conditioned Autoencoder
by: Koyama, Shoichi, et al.
Published: (2025)

Vision Language Models Are Few-Shot Audio Spectrogram Classifiers
by: Dixit, Satvik, et al.
Published: (2024)

A Practical Guide to Spectrogram Analysis for Audio Signal Processing
by: Khodzhaev, Zulfidin
Published: (2024)

Comparison Performance of Spectrogram and Scalogram as Input of Acoustic Recognition Task
by: Phan, Dang Thoai
Published: (2024)

Leveraging AM and FM Rhythm Spectrograms for Dementia Classification and Assessment
by: Gogoi, Parismita, et al.
Published: (2025)

ElasticAST: An Audio Spectrogram Transformer for All Length and Resolutions
by: Feng, Jiu, et al.
Published: (2024)

From Coarse to Fine: Efficient Training for Audio Spectrogram Transformers
by: Feng, Jiu, et al.
Published: (2024)

Magnitude-Phase Dual-Path Speech Enhancement Network based on Self-Supervised Embedding and Perceptual Contrast Stretch Boosting
by: Mattursun, Alimjan, et al.
Published: (2025)

Distilling Spectrograms into Tokens: Fast and Lightweight Bioacoustic Classification for BirdCLEF+ 2025
by: Miyaguchi, Anthony, et al.
Published: (2025)

Abnormal Respiratory Sound Identification Using Audio-Spectrogram Vision Transformer
by: Ariyanti, Whenty, et al.
Published: (2024)

Combining Genre Classification and Harmonic-Percussive Features with Diffusion Models for Music-Video Generation
by: Pina, Leonardo, et al.
Published: (2024)

Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification
by: Bae, Sangmin, et al.
Published: (2023)

Proposal of protocols for speech materials acquisition and presentation assisted by tools based on structured test signals
by: Kawahara, Hideki, et al.
Published: (2024)

SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond
by: Comunità, Marco, et al.
Published: (2024)

METEOR: Melody-aware Texture-controllable Symbolic Orchestral Music Generation via Transformer VAE
by: Le, Dinh-Viet-Toan, et al.
Published: (2024)

SGPA: Spectrogram-Guided Phonetic Alignment for Feasible Shapley Value Explanations in Multimodal Large Language Models
by: Pozorski, Paweł, et al.
Published: (2026)

DMF2Mel: A Dynamic Multiscale Fusion Network for EEG-Driven Mel Spectrogram Reconstruction
by: Fan, Cunhang, et al.
Published: (2025)