| Main Authors: | Dutta, Bikash; Ranjan, Rishabh; Sathvik, Shyam; Vatsa, Mayank; Singh, Richa |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2506.06756 |
Similar Items
Multimodal Zero-Shot Framework for Deepfake Hate Speech Detection in Low-Resource Languages
by: Ranjan, Rishabh, et al.
Published: (2025)
SynHate: Detecting Hate Speech in Synthetic Deepfake Audio
by: Ranjan, Rishabh, et al.
Published: (2025)
Quantum-Inspired Audio Unlearning: Towards Privacy-Preserving Voice Biometrics
by: Pathak, Shreyansh, et al.
Published: (2025)
Zero Shot Audio to Audio Emotion Transfer With Speaker Disentanglement
by: Dutta, Soumya, et al.
Published: (2024)
ALDAS: Audio-Linguistic Data Augmentation for Spoofed Audio Detection
by: Khanjani, Zahra, et al.
Published: (2024)
MLAAD: The Multi-Language Audio Anti-Spoofing Dataset
by: Müller, Nicolas M., et al.
Published: (2024)
Streaming Endpointer for Spoken Dialogue using Neural Audio Codecs and Label-Delayed Training
by: Udupa, Sathvik, et al.
Published: (2025)
How Do Neural Spoofing Countermeasures Detect Partially Spoofed Audio?
by: Liu, Tianchi, et al.
Published: (2024)
Spoof Diarization: "What Spoofed When" in Partially Spoofed Audio
by: Zhang, Lin, et al.
Published: (2024)
Zero-Shot Fake Video Detection by Audio-Visual Consistency
by: Li, Xiaolou, et al.
Published: (2024)
PAT: Parameter-Free Audio-Text Aligner to Boost Zero-Shot Audio Classification
by: Seth, Ashish, et al.
Published: (2024)
Zero-Shot Audio Captioning Using Soft and Hard Prompts
by: Zhang, Yiming, et al.
Published: (2024)
Pengi: An Audio Language Model for Audio Tasks
by: Deshmukh, Soham, et al.
Published: (2023)
Zero-Shot Parkinson's Disease Detection from Speech: Comparing Large Audio and Language Models
by: Kabir, Muhammad Ashad, et al.
Published: (2026)
Augmentation through Laundering Attacks for Audio Spoof Detection
by: Ali, Hashim, et al.
Published: (2024)
Do Music Source Separation Models Preserve Spatial Information in Binaural Audio?
by: Namballa, Richa, et al.
Published: (2025)
Can Large Language Models Understand Spatial Audio?
by: Tang, Changli, et al.
Published: (2024)
Vision Language Models Are Few-Shot Audio Spectrogram Classifiers
by: Dixit, Satvik, et al.
Published: (2024)
Audiobox TTA-RAG: Improving Zero-Shot and Few-Shot Text-To-Audio with Retrieval-Augmented Generation
by: Yang, Mu, et al.
Published: (2024)
Generalizable Audio Spoofing Detection using Non-Semantic Representations
by: Das, Arnab, et al.
Published: (2025)
Towards Attention-based Contrastive Learning for Audio Spoof Detection
by: Goel, Chirag, et al.
Published: (2024)
Interpretable Temporal Class Activation Representation for Audio Spoofing Detection
by: Li, Menglu, et al.
Published: (2024)
When Spoof Detectors Travel: Evaluation Across 66 Languages in the Low-Resource Language Spoofing Corpus
by: Borodin, Kirill, et al.
Published: (2026)
Can Audio Large Language Models Verify Speaker Identity?
by: Ren, Yiming, et al.
Published: (2025)
Domain Adaptation for Contrastive Audio-Language Models
by: Deshmukh, Soham, et al.
Published: (2024)
PAM: Prompting Audio-Language Models for Audio Quality Assessment
by: Deshmukh, Soham, et al.
Published: (2024)
CodecFake: Enhancing Anti-Spoofing Models Against Deepfake Audios from Codec-Based Speech Synthesis Systems
by: Wu, Haibin, et al.
Published: (2024)
A Preliminary Case Study on Long-Form In-the-Wild Audio Spoofing Detection
by: Liu, Xuechen, et al.
Published: (2024)
Everyone-Can-Sing: Zero-Shot Singing Voice Synthesis and Conversion with Speech Reference
by: Dai, Shuqi, et al.
Published: (2025)
CompSpoof: A Dataset and Joint Learning Framework for Component-Level Audio Anti-spoofing Countermeasures
by: Zhang, Xueping, et al.
Published: (2025)
SilentCipher: Deep Audio Watermarking
by: Singh, Mayank Kumar, et al.
Published: (2024)
LJ-Spoof: A Generatively Varied Corpus for Audio Anti-Spoofing and Synthesis Source Tracing
by: Subramani, Surya, et al.
Published: (2026)
AVR: Synergizing Foundation Models for Audio-Visual Humor Detection
by: Sharma, Sarthak, et al.
Published: (2024)
Post-Training Quantization for Audio Diffusion Transformers
by: Khandelwal, Tanmay, et al.
Published: (2025)
Investigating Causal Cues: Strengthening Spoofed Audio Detection with Human-Discernible Linguistic Features
by: Khanjani, Zahra, et al.
Published: (2024)
XLSR-Mamba: A Dual-Column Bidirectional State Space Model for Spoofing Attack Detection
by: Xiao, Yang, et al.
Published: (2024)
MiMo-Audio: Audio Language Models are Few-Shot Learners
by: Core Team, et al.
Published: (2025)
Hierarchical Decoding for Discrete Speech Synthesis with Multi-Resolution Spoof Detection
by: Zhao, Junchuan, et al.
Published: (2026)
CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech
by: Kim, Jaehyeon, et al.
Published: (2024)
Enhancing Zero-shot Audio Classification using Sound Attribute Knowledge from Large Language Models
by: Xu, Xuenan, et al.
Published: (2024)