:: Library Catalog

Bibliographic Details
Main Author:	Ito, Nobutaka
Format:	Preprint
Published:	2026
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2605.25512

Similar Items

Fast Multichannel NMF with Block-Diagonal Spatial Covariance Matrices for Efficient Blind Source Separation Using Distributed Microphone Arrays
by: Nishikori, Hirotaka, et al.
Published: (2026)

Subspace Track-before-Detect for Passive Multi-Target Tracking with Unknown Emitted Signals
by: Ito, Nobutaka, et al.
Published: (2026)

Simultaneous Diarization and Separation of Meetings through the Integration of Statistical Mixture Models
by: Cord-Landwehr, Tobias, et al.
Published: (2024)

Mixture to Mixture: Leveraging Close-talk Mixtures as Weak-supervision for Speech Separation
by: Wang, Zhong-Qiu
Published: (2024)

30+ Years of Source Separation Research: Achievements and Future Challenges
by: Araki, Shoko, et al.
Published: (2025)

Mel-Spectrogram Inversion via Alternating Direction Method of Multipliers
by: Masuyama, Yoshiki, et al.
Published: (2025)

Unified Diffusion Refinement for Multi-Channel Speech Enhancement and Separation
by: Xu, Zhongweiyang, et al.
Published: (2026)

Geneses: Unified Generative Speech Enhancement and Separation
by: Asai, Kohei, et al.
Published: (2026)

Neural Blind Source Separation and Diarization for Distant Speech Recognition
by: Bando, Yoshiaki, et al.
Published: (2024)

UniArray: Unified Spectral-Spatial Modeling for Array-Geometry-Agnostic Speech Separation
by: Chen, Weiguang, et al.
Published: (2025)

Exploring Efficient Directional and Distance Cues for Regional Speech Separation
by: Jiang, Yiheng, et al.
Published: (2025)

Preserving Speaker Information in Direct Speech-to-Speech Translation with Non-Autoregressive Generation and Pretraining
by: Zhou, Rui, et al.
Published: (2024)

EmoSSLSphere: Multilingual Emotional Speech Synthesis with Spherical Vectors and Discrete Speech Tokens
by: Park, Joonyong, et al.
Published: (2025)

Hyperbolic Distance-Based Speech Separation
by: Petermann, Darius, et al.
Published: (2024)

Single Channel Blind Dereverberation of Speech Signals
by: Nigam, Dhruv
Published: (2025)

What Do Neurons Listen To? A Neuron-level Dissection of a General-purpose Audio Model
by: Kawamura, Takao, et al.
Published: (2026)

Separate and Reconstruct: Asymmetric Encoder-Decoder for Speech Separation
by: Shin, Ui-Hyeop, et al.
Published: (2024)

USE: A Unified Model for Universal Sound Separation and Extraction
by: Wang, Hongyu, et al.
Published: (2025)

Determined Multichannel Blind Source Separation with Clustered Source Model
by: Wang, Jianyu, et al.
Published: (2024)

Towards Blind Data Cleaning: A Case Study in Music Source Separation
by: Gui, Azalea, et al.
Published: (2025)

Incremental Averaging Method to Improve Graph-Based Time-Difference-of-Arrival Estimation
by: Brümann, Klaus, et al.
Published: (2025)

End-to-End Speech Recognition with Pre-trained Masked Language Model
by: Higuchi, Yosuke, et al.
Published: (2024)

Adapting Speech Foundation Models for Unified Multimodal Speech Recognition with Large Language Models
by: Zhang, Jing-Xuan, et al.
Published: (2025)

Fast Swap-Based Element Selection for Multiplication-Free Dimension Reduction
by: Ono, Nobutaka
Published: (2026)

MOPSA: Mixture of Prompt-Experts Based Speaker Adaptation for Elderly Speech Recognition
by: Deng, Chengxi, et al.
Published: (2025)

Mixture to Beamformed Mixture: Leveraging Beamformed Mixture as Weak-Supervision for Speech Enhancement and Noise-Robust ASR
by: Wang, Zhong-Qiu, et al.
Published: (2025)

Disentangled-Transformer: An Explainable End-to-End Automatic Speech Recognition Model with Speech Content-Context Separation
by: Wang, Pu, et al.
Published: (2024)

Cross-Talk Speech Reduction, by Separation, for Separation
by: Wang, Zhong-Qiu, et al.
Published: (2026)

Dynamic Slimmable Networks for Efficient Speech Separation
by: Elminshawi, Mohamed, et al.
Published: (2025)

On the Invariance of Cross-Correlation Peak Positions Under Monotonic Signal Transformations, with Application to Fast Time Difference Estimation
by: Ueno, Natsuki, et al.
Published: (2025)

FNH-TTS: Mixture-of-Experts Duration Modeling for Robust Neural Speech Synthesis
by: Meng, Qingliang, et al.
Published: (2025)

MAGE: A Coarse-to-Fine Speech Enhancer with Masked Generative Model
by: Pham, The Hieu, et al.
Published: (2025)

Direct Preference Optimization for Speech Autoregressive Diffusion Models
by: Liu, Zhijun, et al.
Published: (2025)

Lightweight and Robust Multi-Channel End-to-End Speech Recognition with Spherical Harmonic Transform
by: Kong, Xiangzhu, et al.
Published: (2025)

EDSep: An Effective Diffusion-Based Method for Speech Source Separation
by: Dong, Jinwei, et al.
Published: (2025)

Test-Time Adaptation For Speech Enhancement Via Mask Polarization
by: Raichle, Tobias, et al.
Published: (2026)

Noise-robust Speech Separation with Fast Generative Correction
by: Wang, Helin, et al.
Published: (2024)

Blind Source Separation in Biomedical Signals Using Variational Methods
by: Torabi, Yasaman, et al.
Published: (2025)

A Fast and Lightweight Model for Causal Audio-Visual Speech Separation
by: Sang, Wendi, et al.
Published: (2025)

ArrayDPS: Unsupervised Blind Speech Separation with a Diffusion Prior
by: Xu, Zhongweiyang, et al.
Published: (2025)