| Main Authors: | Akaishi, Natsuki, Holighaus, Nicki, Yatabe, Kohei |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.16421 |
Similar Items
Phase-Based Signal Representations for Scattering
by: Haider, Daniel, et al.
Published: (2022)
Mel-Spectrogram Inversion via Alternating Direction Method of Multipliers
by: Masuyama, Yoshiki, et al.
Published: (2025)
Subband Splitting: Simple, Efficient and Effective Technique for Solving Block Permutation Problem in Determined Blind Source Separation
by: Matsumoto, Kazuki, et al.
Published: (2024)
Local Equivariance Error-Based Metrics for Evaluating Sampling-Frequency-Independent Property of Neural Network
by: Imamura, Kanami, et al.
Published: (2025)
Algorithms of Sampling-Frequency-Independent Layers for Non-integer Strides
by: Imamura, Kanami, et al.
Published: (2023)
FAST: Fast Audio Spectrogram Transformer
by: Naman, Anugunj, et al.
Published: (2025)
Musical Source Separation of Brazilian Percussion
by: Namballa, Richa, et al.
Published: (2025)
Audio Compression using Periodic Gabor with Biorthogonal Exchange: Implementation Using the Zak Transform
by: Alimi, Roger, et al.
Published: (2025)
Drum-to-Vocal Percussion Sound Conversion and Its Evaluation Methodology
by: Nobukawa, Rinka, et al.
Published: (2025)
Adapter Incremental Continual Learning of Efficient Audio Spectrogram Transformers
by: Selvaraj, Nithish Muthuchamy, et al.
Published: (2023)
ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram
by: Jiang, Xiao-Hang, et al.
Published: (2024)
ASGIR: Audio Spectrogram Transformer Guided Classification And Information Retrieval For Birds
by: Chaudhuri, Yashwardhan, et al.
Published: (2024)
ASM: Audio Spectrogram Mixer
by: Ji, Qingfeng, et al.
Published: (2024)
Evaluating CNN with Stacked Feature Representations and Audio Spectrogram Transformer Models for Sound Classification
by: Dehaghania, Parinaz Binandeh, et al.
Published: (2026)
Improving Audio Spectrogram Transformers for Sound Event Detection Through Multi-Stage Training
by: Schmid, Florian, et al.
Published: (2024)
Speech-Declipping Transformer with Complex Spectrogram and Learnerble Temporal Features
by: Kwon, Younghoo, et al.
Published: (2024)
A Comparative Study on Positional Encoding for Time-frequency Domain Dual-path Transformer-based Source Separation Models
by: Saijo, Kohei, et al.
Published: (2025)
Sound Safeguarding for Acoustic Measurement Using Any Sounds: Tools and Applications
by: Kawahara, Hideki, et al.
Published: (2025)
Convolutional Variational Autoencoders for Spectrogram Compression in Automatic Speech Recognition
by: Iakovenko, Olga, et al.
Published: (2024)
Dual-View Predictive Diffusion: Lightweight Speech Enhancement via Spectrogram-Image Synergy
by: Xue, Ke, et al.
Published: (2026)
Synthesizer Sound Matching Using Audio Spectrogram Transformers
by: Bruford, Fred, et al.
Published: (2024)
Enhancing Spectrogram Realism in Singing Voice Synthesis via Explicit Bandwidth Extension Prior to Vocoder
by: Yang, Runxuan, et al.
Published: (2025)
Ambisonics Binaural Rendering via Masked Magnitude Least Squares
by: Berebi, Or, et al.
Published: (2025)
Learning Magnitude Distribution of Sound Fields via Conditioned Autoencoder
by: Koyama, Shoichi, et al.
Published: (2025)
Vision Language Models Are Few-Shot Audio Spectrogram Classifiers
by: Dixit, Satvik, et al.
Published: (2024)
A Practical Guide to Spectrogram Analysis for Audio Signal Processing
by: Khodzhaev, Zulfidin
Published: (2024)
Comparison Performance of Spectrogram and Scalogram as Input of Acoustic Recognition Task
by: Phan, Dang Thoai
Published: (2024)
Leveraging AM and FM Rhythm Spectrograms for Dementia Classification and Assessment
by: Gogoi, Parismita, et al.
Published: (2025)
ElasticAST: An Audio Spectrogram Transformer for All Length and Resolutions
by: Feng, Jiu, et al.
Published: (2024)
From Coarse to Fine: Efficient Training for Audio Spectrogram Transformers
by: Feng, Jiu, et al.
Published: (2024)
Magnitude-Phase Dual-Path Speech Enhancement Network based on Self-Supervised Embedding and Perceptual Contrast Stretch Boosting
by: Mattursun, Alimjan, et al.
Published: (2025)
Distilling Spectrograms into Tokens: Fast and Lightweight Bioacoustic Classification for BirdCLEF+ 2025
by: Miyaguchi, Anthony, et al.
Published: (2025)
Abnormal Respiratory Sound Identification Using Audio-Spectrogram Vision Transformer
by: Ariyanti, Whenty, et al.
Published: (2024)
Combining Genre Classification and Harmonic-Percussive Features with Diffusion Models for Music-Video Generation
by: Pina, Leonardo, et al.
Published: (2024)
Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification
by: Bae, Sangmin, et al.
Published: (2023)
Proposal of protocols for speech materials acquisition and presentation assisted by tools based on structured test signals
by: Kawahara, Hideki, et al.
Published: (2024)
SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond
by: Comunità, Marco, et al.
Published: (2024)
METEOR: Melody-aware Texture-controllable Symbolic Orchestral Music Generation via Transformer VAE
by: Le, Dinh-Viet-Toan, et al.
Published: (2024)
SGPA: Spectrogram-Guided Phonetic Alignment for Feasible Shapley Value Explanations in Multimodal Large Language Models
by: Pozorski, Paweł, et al.
Published: (2026)
DMF2Mel: A Dynamic Multiscale Fusion Network for EEG-Driven Mel Spectrogram Reconstruction
by: Fan, Cunhang, et al.
Published: (2025)