| Main Authors: | Wang, Chu; Wu, Jinhong; Wang, Yanzhi; Zha, Zhijian; Zhou, Qi |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2403.01132 |
Similar Items
Physics-informed neural network for acoustic resonance analysis in a one-dimensional acoustic tube
by: Yokota, Kazuya, et al.
Published: (2023)
KS-Net: Multi-band joint speech restoration and enhancement network for 2024 ICASSP SSI Challenge
by: Yu, Guochen, et al.
Published: (2024)
A k-space approach to modeling multi-channel parametric array loudspeaker systems
by: Zhuang, Tao, et al.
Published: (2025)
Point to the Hidden: Exposing Speech Audio Splicing via Signal Pointer Nets
by: Moussa, Denise, et al.
Published: (2023)
CrossNet: Leveraging Global, Cross-Band, Narrow-Band, and Positional Encoding for Single- and Multi-Channel Speaker Separation
by: Kalkhorani, Vahid Ahmadi, et al.
Published: (2024)
HingeNet: A Harmonic-Aware Fine-Tuning Approach for Beat Tracking
by: Ru, Ganghui, et al.
Published: (2025)
META-CAT: Speaker-Informed Speech Embeddings via Meta Information Concatenation for Multi-talker ASR
by: Wang, Jinhan, et al.
Published: (2024)
The Arrow of Time in Music -- Revisiting the Temporal Structure of Music with Distinguishability and Unique Orientability as the Anchor Point
by: Xu, Qi
Published: (2023)
The THU-HCSI Multi-Speaker Multi-Lingual Few-Shot Voice Cloning System for LIMMITS'24 Challenge
by: Zhou, Yixuan, et al.
Published: (2024)
EZhouNet: A framework based on graph neural network and anchor interval for the respiratory sound event detection
by: Chu, Yun, et al.
Published: (2025)
Multi-Scale Accent Modeling and Disentangling for Multi-Speaker Multi-Accent Text-to-Speech Synthesis
by: Zhou, Xuehao, et al.
Published: (2024)
Phoneme-based speech recognition driven by large language models and sampling marginalization
by: Ma, Te, et al.
Published: (2025)
CUSIDE-T: Chunking, Simulating Future and Decoding for Transducer based Streaming ASR
by: Zhao, Wenbo, et al.
Published: (2024)
GMM-ResNet2: Ensemble of Group ResNet Networks for Synthetic Speech Detection
by: Lei, Zhenchun, et al.
Published: (2024)
UniTTS: An end-to-end TTS system without decoupling of acoustic and semantic information
by: Wang, Rui, et al.
Published: (2025)
Theory and investigation of acoustic multiple-input multiple-output systems based on spherical arrays in a room
by: Morgenstern, Hai, et al.
Published: (2024)
PhiNet: Speaker Verification with Phonetic Interpretability
by: Ma, Yi, et al.
Published: (2026)
SingNet: Towards a Large-Scale, Diverse, and In-the-Wild Singing Voice Dataset
by: Gu, Yicheng, et al.
Published: (2025)
Phoenix-VAD: Streaming Semantic Endpoint Detection for Full-Duplex Speech Interaction
by: Wu, Weijie, et al.
Published: (2025)
PrimeK-Net: Multi-scale Spectral Learning via Group Prime-Kernel Convolutional Neural Networks for Single Channel Speech Enhancement
by: Lin, Zizhen, et al.
Published: (2025)
Physics-Informed Machine Learning For Sound Field Estimation
by: Koyama, Shoichi, et al.
Published: (2024)
Identification of Physical Properties in Acoustic Tubes Using Physics-Informed Neural Networks
by: Yokota, Kazuya, et al.
Published: (2024)
Improving Real-Time Music Accompaniment Separation with MMDenseNet
by: Wang, Chun-Hsiang, et al.
Published: (2024)
MSU-Bench: Towards Understanding the Conversational Multi-talker Scenarios
by: Wang, Shuai, et al.
Published: (2025)
WMCodec: End-to-End Neural Speech Codec with Deep Watermarking for Authenticity Verification
by: Zhou, Junzuo, et al.
Published: (2024)
E2E-AEC: Implementing an end-to-end neural network learning approach for acoustic echo cancellation
by: Jiang, Yiheng, et al.
Published: (2026)
ArtifactNet: Detecting AI-Generated Music via Forensic Residual Physics
by: Oh, Heewon
Published: (2026)
Deep learning classification system for coconut maturity levels based on acoustic signals
by: Caladcad, June Anne, et al.
Published: (2024)
Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction
by: Chen, Xueyuan, et al.
Published: (2024)
A robust audio deepfake detection system via multi-view feature
by: Yang, Yujie, et al.
Published: (2024)
Communication conditions in virtual acoustic scenes in an underground station
by: Hládek, Ľuboš, et al.
Published: (2021)
Robust DOA estimation using deep acoustic imaging
by: Roman, Adrian S., et al.
Published: (2024)
Polyphonia: Zero-Shot Timbre Transfer in Polyphonic Music with Acoustic-Informed Attention Calibration
by: Li, Haowen, et al.
Published: (2026)
Multi-Step Prediction and Control of Hierarchical Emotion Distribution in Text-to-Speech Synthesis
by: Inoue, Sho, et al.
Published: (2025)
A toolbox for rendering virtual acoustic environments in the context of audiology
by: Grimm, Giso, et al.
Published: (2018)
Guiding the underwater acoustic target recognition with interpretable contrastive learning
by: Xie, Yuan, et al.
Published: (2024)
Learning Vocal-Tract Area and Radiation with a Physics-Informed Webster Model
by: Lu, Minhui, et al.
Published: (2026)
In This Environment, As That Speaker: A Text-Driven Framework for Multi-Attribute Speech Conversion
by: Jin, Jiawei, et al.
Published: (2025)
Phase-Retrieval-Based Physics-Informed Neural Networks For Acoustic Magnitude Field Reconstruction
by: Schrader, Karl, et al.
Published: (2026)
Unsupervised Multi-channel Speech Dereverberation via Diffusion
by: Wu, Yulun, et al.
Published: (2025)