:: Library Catalog

Bibliographic Details
Main Authors:	Dutta, Bikash; Ranjan, Rishabh; Sathvik, Shyam; Vatsa, Mayank; Singh, Richa
Format:	Preprint
Published:	2025
Subjects:	Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2506.06756

Similar Items

Multimodal Zero-Shot Framework for Deepfake Hate Speech Detection in Low-Resource Languages
by: Ranjan, Rishabh, et al.
Published: (2025)

SynHate: Detecting Hate Speech in Synthetic Deepfake Audio
by: Ranjan, Rishabh, et al.
Published: (2025)

Quantum-Inspired Audio Unlearning: Towards Privacy-Preserving Voice Biometrics
by: Pathak, Shreyansh, et al.
Published: (2025)

Zero Shot Audio to Audio Emotion Transfer With Speaker Disentanglement
by: Dutta, Soumya, et al.
Published: (2024)

ALDAS: Audio-Linguistic Data Augmentation for Spoofed Audio Detection
by: Khanjani, Zahra, et al.
Published: (2024)

MLAAD: The Multi-Language Audio Anti-Spoofing Dataset
by: Müller, Nicolas M., et al.
Published: (2024)

Streaming Endpointer for Spoken Dialogue using Neural Audio Codecs and Label-Delayed Training
by: Udupa, Sathvik, et al.
Published: (2025)

How Do Neural Spoofing Countermeasures Detect Partially Spoofed Audio?
by: Liu, Tianchi, et al.
Published: (2024)

Spoof Diarization: "What Spoofed When" in Partially Spoofed Audio
by: Zhang, Lin, et al.
Published: (2024)

Zero-Shot Fake Video Detection by Audio-Visual Consistency
by: Li, Xiaolou, et al.
Published: (2024)

PAT: Parameter-Free Audio-Text Aligner to Boost Zero-Shot Audio Classification
by: Seth, Ashish, et al.
Published: (2024)

Zero-Shot Audio Captioning Using Soft and Hard Prompts
by: Zhang, Yiming, et al.
Published: (2024)

Pengi: An Audio Language Model for Audio Tasks
by: Deshmukh, Soham, et al.
Published: (2023)

Zero-Shot Parkinson's Disease Detection from Speech: Comparing Large Audio and Language Models
by: Kabir, Muhammad Ashad, et al.
Published: (2026)

Augmentation through Laundering Attacks for Audio Spoof Detection
by: Ali, Hashim, et al.
Published: (2024)

Do Music Source Separation Models Preserve Spatial Information in Binaural Audio?
by: Namballa, Richa, et al.
Published: (2025)

Can Large Language Models Understand Spatial Audio?
by: Tang, Changli, et al.
Published: (2024)

Vision Language Models Are Few-Shot Audio Spectrogram Classifiers
by: Dixit, Satvik, et al.
Published: (2024)

Audiobox TTA-RAG: Improving Zero-Shot and Few-Shot Text-To-Audio with Retrieval-Augmented Generation
by: Yang, Mu, et al.
Published: (2024)

Generalizable Audio Spoofing Detection using Non-Semantic Representations
by: Das, Arnab, et al.
Published: (2025)

Towards Attention-based Contrastive Learning for Audio Spoof Detection
by: Goel, Chirag, et al.
Published: (2024)

Interpretable Temporal Class Activation Representation for Audio Spoofing Detection
by: Li, Menglu, et al.
Published: (2024)

When Spoof Detectors Travel: Evaluation Across 66 Languages in the Low-Resource Language Spoofing Corpus
by: Borodin, Kirill, et al.
Published: (2026)

Can Audio Large Language Models Verify Speaker Identity?
by: Ren, Yiming, et al.
Published: (2025)

Domain Adaptation for Contrastive Audio-Language Models
by: Deshmukh, Soham, et al.
Published: (2024)

PAM: Prompting Audio-Language Models for Audio Quality Assessment
by: Deshmukh, Soham, et al.
Published: (2024)

CodecFake: Enhancing Anti-Spoofing Models Against Deepfake Audios from Codec-Based Speech Synthesis Systems
by: Wu, Haibin, et al.
Published: (2024)

A Preliminary Case Study on Long-Form In-the-Wild Audio Spoofing Detection
by: Liu, Xuechen, et al.
Published: (2024)

Everyone-Can-Sing: Zero-Shot Singing Voice Synthesis and Conversion with Speech Reference
by: Dai, Shuqi, et al.
Published: (2025)

CompSpoof: A Dataset and Joint Learning Framework for Component-Level Audio Anti-spoofing Countermeasures
by: Zhang, Xueping, et al.
Published: (2025)

SilentCipher: Deep Audio Watermarking
by: Singh, Mayank Kumar, et al.
Published: (2024)

LJ-Spoof: A Generatively Varied Corpus for Audio Anti-Spoofing and Synthesis Source Tracing
by: Subramani, Surya, et al.
Published: (2026)

AVR: Synergizing Foundation Models for Audio-Visual Humor Detection
by: Sharma, Sarthak, et al.
Published: (2024)

Post-Training Quantization for Audio Diffusion Transformers
by: Khandelwal, Tanmay, et al.
Published: (2025)

Investigating Causal Cues: Strengthening Spoofed Audio Detection with Human-Discernible Linguistic Features
by: Khanjani, Zahra, et al.
Published: (2024)

XLSR-Mamba: A Dual-Column Bidirectional State Space Model for Spoofing Attack Detection
by: Xiao, Yang, et al.
Published: (2024)

MiMo-Audio: Audio Language Models are Few-Shot Learners
by: Core Team, et al.
Published: (2025)

Hierarchical Decoding for Discrete Speech Synthesis with Multi-Resolution Spoof Detection
by: Zhao, Junchuan, et al.
Published: (2026)

CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech
by: Kim, Jaehyeon, et al.
Published: (2024)

Enhancing Zero-shot Audio Classification using Sound Attribute Knowledge from Large Language Models
by: Xu, Xuenan, et al.
Published: (2024)