| Main Authors: | Stankevich, A., Nechepurenko, I., Shevchenko, A., Gremyachikh, L., Ustyuzhanin, A., Vasyukov, A. |
|---|---|
| Format: | Preprint |
| Published: | 2021 |
| Online Access: | https://arxiv.org/abs/2110.08626 |
Similar Items
Joint Feature and Output Distillation for Low-complexity Acoustic Scene Classification
by: Li, Haowen, et al.
Published: (2025)
High-resolution closed-loop seismic inversion network in time-frequency phase mixed domain
by: Liu, Yingtian, et al.
Published: (2024)
Noise-Robust Keyword Spotting through Self-supervised Pretraining
by: Mørk, Jacob, et al.
Published: (2024)
Self-supervised Pretraining for Robust Personalized Voice Activity Detection in Adverse Conditions
by: Bovbjerg, Holger Severin, et al.
Published: (2023)
Learning Robust Spatial Representations from Binaural Audio through Feature Distillation
by: Bovbjerg, Holger Severin, et al.
Published: (2025)
Noise-Robust Target-Speaker Voice Activity Detection Through Self-Supervised Pretraining
by: Bovbjerg, Holger Severin, et al.
Published: (2025)
A Dynamic Learning Observatory Reveals the Rapid Salinization of Satkhira, Bangladesh
by: Sarkar, Showmitra Kumar, et al.
Published: (2026)
The Nash-MTL-STCN Method For Prestack Three-Parameter Inversion
by: Liu, Yingtian, et al.
Published: (2024)
Audio-based Kinship Verification Using Age Domain Conversion
by: Sun, Qiyang, et al.
Published: (2024)
Fast Diffusion Model For Seismic Data Noise Attenuation
by: Peng, Junheng, et al.
Published: (2024)
Seismic Data Strong Noise Attenuation Based on Diffusion Model and Principal Component Analysis
by: Peng, Junheng, et al.
Published: (2023)
TuneGenie: Reasoning-based LLM agents for preferential music generation
by: Pandey, Amitesh, et al.
Published: (2025)
KinSPEAK: Improving speech recognition for Kinyarwanda via semi-supervised learning methods
by: Nzeyimana, Antoine
Published: (2023)
Three-dimensional inversion of gravity data using implicit neural representations and scientific machine learning
by: Mishra, Pankaj K, et al.
Published: (2025)
Passive Underwater Acoustic Signal Separation based on Feature Decoupling Dual-path Network
by: Liu, Yucheng, et al.
Published: (2025)
Experimental Study: Enhancing Voice Spoofing Detection Models with wav2vec 2.0
by: Kang, Taein, et al.
Published: (2024)
Rethinking Masking Strategies for Masked Prediction-based Audio Self-supervised Learning
by: Niizumi, Daisuke, et al.
Published: (2026)
Beyond Deep Learning: Speech Segmentation and Phone Classification with Neural Assemblies
by: Adelson, Trevor, et al.
Published: (2026)
Deepfake audio as a data augmentation technique for training automatic speech to text transcription models
by: Ferreira, Alexandre R., et al.
Published: (2023)
Quantum-Enhanced Analysis and Grading of Vocal Performance
by: Agarwal, Rohan
Published: (2025)
Modeling L1 Influence on L2 Pronunciation: An MFCC-Based Framework for Explainable Machine Learning and Pedagogical Feedback
by: Jahanbin, Peyman
Published: (2025)
Analyzing and Exploring Training Recipes for Large-Scale Transformer-Based Weather Prediction
by: Willard, Jared D., et al.
Published: (2024)
A Multimodal Symphony: Integrating Taste and Sound through Generative AI
by: Spanio, Matteo, et al.
Published: (2025)
GraFPrint: A GNN-Based Approach for Audio Identification
by: Bhattacharjee, Aditya, et al.
Published: (2024)
Audio Foundation Models Outperform Symbolic Representations for Piano Performance Evaluation
by: Dhiman, Jai
Published: (2026)
Scalable Evaluation for Audio Identification via Synthetic Latent Fingerprint Generation
by: Bhattacharjee, Aditya, et al.
Published: (2025)
Fine-tuning Pre-trained Audio Models for COVID-19 Detection: A Technical Report
by: de Brito, Daniel Oliveira, et al.
Published: (2025)
Machine Learning Framework for Audio-Based Content Evaluation using MFCC, Chroma, Spectral Contrast, and Temporal Feature Engineering
by: Aristorenas, Aris J.
Published: (2024)
Joint Estimation of Piano Dynamics and Metrical Structure with a Multi-task Multi-Scale Network
by: He, Zhanhong, et al.
Published: (2025)
ChordSync: Conformer-Based Alignment of Chord Annotations to Music Audio
by: Poltronieri, Andrea, et al.
Published: (2024)
Symbolic Audio Classification via Modal Decision Tree Learning
by: Marzano, Enrico, et al.
Published: (2025)
Should you use a probabilistic duration model in TTS? Probably! Especially for spontaneous speech
by: Mehta, Shivam, et al.
Published: (2024)
Detection and Classification of Cetacean Echolocation Clicks using Image-based Object Detection Methods applied to Advanced Wavelet-based Transformations
by: Hauer, Christopher
Published: (2026)
Make Some Noise: Towards LLM audio reasoning and generation using sound tokens
by: Mehta, Shivam, et al.
Published: (2025)
Local Diagnostics of Continuous Normalizing Flow for Out-of-Distribution Detection
by: Cao, Xinwei, et al.
Published: (2026)
Decoding Phone Pairs from MEG Signals Across Speech Modalities
by: de Zuazo, Xabier, et al.
Published: (2025)
SemAlignVC: Enhancing zero-shot timbre conversion using semantic alignment
by: Mehta, Shivam, et al.
Published: (2025)
Accurate typhoon intensity forecasts using a non-iterative spatiotemporal transformer model
by: Qu, Hongyu, et al.
Published: (2025)
A Dual-TransUNet Deep Learning Framework for Multi-Source Precipitation Merging and Improving Seasonal and Extreme Estimates
by: Ye, Yuchen, et al.
Published: (2026)
SigWavNet: Learning Multiresolution Signal Wavelet Network for Speech Emotion Recognition
by: Nfissi, Alaa, et al.
Published: (2025)