| Main Authors: | Soumik, Mohd. Farhan Israk, Mithsara, W. K. M., Shahid, Abdur R., Imteaj, Ahmed |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2501.18727 |
Similar Items
WeDefense: A Toolkit to Defend Against Fake Audio
by: Zhang, Lin, et al.
Published: (2026)
Exploring Finetuned Audio-LLM on Heart Murmur Features
by: Florea, Adrian, et al.
Published: (2025)
Replay Attacks Against Audio Deepfake Detection
by: Müller, Nicolas, et al.
Published: (2025)
SemanticAudio: Audio Generation and Editing in Semantic Space
by: Dai, Zheqi, et al.
Published: (2026)
Are Mamba-based Audio Foundation Models the Best Fit for Non-Verbal Emotion Recognition?
by: Akhtar, Mohd Mujtaba, et al.
Published: (2025)
Feature Selection via Graph Topology Inference for Soundscape Emotion Recognition
by: Rey, Samuel, et al.
Published: (2025)
Exploring Differences between Human Perception and Model Inference in Audio Event Recognition
by: Tan, Yizhou, et al.
Published: (2024)
Yours or Mine? Overwriting Attacks Against Neural Audio Watermarking
by: Yao, Lingfeng, et al.
Published: (2025)
AudioEditor: A Training-Free Diffusion-Based Audio Editing Framework
by: Jia, Yuhang, et al.
Published: (2024)
Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation
by: Wu, Shih-Lun, et al.
Published: (2023)
Audio-Language Models for Audio-Centric Tasks: A Systematic Survey
by: Su, Yi, et al.
Published: (2025)
Towards Privacy-Preserving Audio Classification Systems
by: Chhaglani, Bhawana, et al.
Published: (2024)
Fine-Grained Quantitative Emotion Editing for Speech Generation
by: Inoue, Sho, et al.
Published: (2024)
Defense Against Synthetic Speech: Real-Time Detection of RVC Voice Conversion Attacks
by: Chinchmalatpure, Prajwal, et al.
Published: (2025)
Weighted-Sampling Audio Adversarial Example Attack
by: Liu, Xiaolei, et al.
Published: (2019)
Transferable Adversarial Attacks on Audio Deepfake Detection
by: Farooq, Muhammad Umar, et al.
Published: (2025)
Towards Neural Audio Codec Source Parsing
by: Phukan, Orchid Chetia, et al.
Published: (2025)
SALM: Spatial Audio Language Model with Structured Embeddings for Understanding and Editing
by: Hu, Jinbo, et al.
Published: (2025)
EmoFake: An Initial Dataset for Emotion Fake Audio Detection
by: Zhao, Yan, et al.
Published: (2022)
Rethinking Continual Learning for Speech and Audio: A Representation-Centric Taxonomy and Open Problems
by: Xiao, Yang, et al.
Published: (2026)
Improving Audio Question Answering with Variational Inference
by: Chen, Haolin
Published: (2026)
EMO-RL: Emotion-Rule-Based Reinforcement Learning Enhanced Audio-Language Model for Generalized Speech Emotion Recognition
by: Li, Pengcheng, et al.
Published: (2025)
Exploring Perceptual Audio Quality Measurement on Stereo Processing Using the Open Dataset of Audio Quality
by: Delgado, Pablo M., et al.
Published: (2025)
Study of Pre-processing Defenses against Adversarial Attacks on State-of-the-art Speaker Recognition Systems
by: Joshi, Sonal, et al.
Published: (2021)
Interpretable Audio Editing Evaluation via Chain-of-Thought Difference-Commonality Reasoning with Multimodal LLMs
by: Jia, Yuhang, et al.
Published: (2025)
Audio-Visual Speech Enhancement in Noisy Environments via Emotion-Based Contextual Cues
by: Hussain, Tassadaq, et al.
Published: (2024)
Robust Audio-Visual Target Speaker Extraction with Emotion-Aware Multiple Enrollment Fusion
by: Jin, Zhan, et al.
Published: (2025)
Prediction of Spotify Chart Success Using Audio and Streaming Features
by: Cabansag, Ian Jacob, et al.
Published: (2025)
Audio Conditioning for Music Generation via Discrete Bottleneck Features
by: Rouard, Simon, et al.
Published: (2024)
Zero Shot Audio to Audio Emotion Transfer With Speaker Disentanglement
by: Dutta, Soumya, et al.
Published: (2024)
Investigating Effective Speaker Property Privacy Protection in Federated Learning for Speech Emotion Recognition
by: Tan, Chao, et al.
Published: (2024)
Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?
by: Rouditchenko, Andrew, et al.
Published: (2025)
A Noval Feature via Color Quantisation for Fake Audio Detection
by: Wang, Zhiyong, et al.
Published: (2024)
Multimodal Emotion Recognition from Raw Audio with Sinc-convolution
by: Zhang, Xiaohui, et al.
Published: (2024)
Adversarial Attacks and Defenses for Speech Recognition Systems
by: Żelasko, Piotr, et al.
Published: (2021)
Comparative Evaluation of Text and Audio Simplification: A Methodological Replication Study
by: Barai, Prosanta, et al.
Published: (2025)
ST-ITO: Controlling Audio Effects for Style Transfer with Inference-Time Optimization
by: Steinmetz, Christian J., et al.
Published: (2024)
Audio Editing with Non-Rigid Text Prompts
by: Paissan, Francesco, et al.
Published: (2023)
Exploring Self-Supervised Audio Models for Generalized Anomalous Sound Detection
by: Han, Bing, et al.
Published: (2025)
Exploring Text-Queried Sound Event Detection with Audio Source Separation
by: Yin, Han, et al.
Published: (2024)