:: Library Catalog

Bibliographic Details
Main Authors:	Grinberg, Petr; Kumar, Ankur; Koppisetti, Surya; Bharaj, Gaurav
Format:	Preprint
Published:	2025
Subjects:	Audio and Speech Processing; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2506.03425

Similar Items

What Does an Audio Deepfake Detector Focus on? A Study in the Time Domain
by: Grinberg, Petr, et al.
Published: (2025)

SLIM: Style-Linguistics Mismatch Model for Generalized Audio Deepfake Detection
by: Zhu, Yi, et al.
Published: (2024)

Towards Attention-based Contrastive Learning for Audio Spoof Detection
by: Goel, Chirag, et al.
Published: (2024)

Learn from Real: Reality Defender's Submission to ASVspoof5 Challenge
by: Zhu, Yi, et al.
Published: (2024)

A SUPERB-Style Benchmark of Self-Supervised Speech Models for Audio Deepfake Detection
by: Ali, Hashim, et al.
Published: (2026)

AVFF: Audio-Visual Feature Fusion for Video Deepfake Detection
by: Oorloff, Trevine, et al.
Published: (2024)

ICLAD: In-Context Learning with Comparison-Guidance for Audio Deepfake Detection
by: Chou, Benjamin, et al.
Published: (2026)

Similarity Choice and Negative Scaling in Supervised Contrastive Learning for Deepfake Audio Detection
by: Sudan, Jaskirat, et al.
Published: (2026)

Content and Style Aware Audio-Driven Facial Animation
by: Liu, Qingju, et al.
Published: (2024)

A Knowledge-Driven Approach to Music Segmentation, Music Source Separation and Cinematic Audio Source Separation
by: Ho, Chun-wei, et al.
Published: (2026)

Lightweight Joint Audio-Visual Deepfake Detection via Single-Stream Multi-Modal Learning Framework
by: Zhang, Kuiyuan, et al.
Published: (2025)

LMAC-TD: Producing Time Domain Explanations for Audio Classifiers
by: Mancini, Eleonora, et al.
Published: (2024)

Rehearsal with Auxiliary-Informed Sampling for Audio Deepfake Detection
by: Febrinanto, Falih Gozi, et al.
Published: (2025)

Distilling Conversations: Abstract Compression of Conversational Audio Context for LLM-based ASR
by: Kumar, Shashi, et al.
Published: (2026)

XMAD-Bench: Cross-Domain Multilingual Audio Deepfake Benchmark
by: Ciobanu, Ioan-Paul, et al.
Published: (2025)

Multilingual Dataset Integration Strategies for Robust Audio Deepfake Detection: A SAFE Challenge System
by: Ali, Hashim, et al.
Published: (2025)

Diffused Responsibility: Analyzing the Energy Consumption of Generative Text-to-Audio Diffusion Models
by: Passoni, Riccardo, et al.
Published: (2025)

Prompt-guided Precise Audio Editing with Diffusion Models
by: Xu, Manjie, et al.
Published: (2024)

Does Current Deepfake Audio Detection Model Effectively Detect ALM-based Deepfake Audio?
by: Xie, Yuankun, et al.
Published: (2024)

LongAudio-RAG: Event-Grounded Question Answering over Multi-Hour Long Audio
by: Vakada, Naveen, et al.
Published: (2026)

Alethia: A Foundational Encoder for Voice Deepfakes
by: Zhu, Yi, et al.
Published: (2026)

Leveraging Whisper Embeddings for Audio-based Lyrics Matching
by: Mancini, Eleonora, et al.
Published: (2025)

Multimodal Audio-based Disease Prediction with Transformer-based Hierarchical Fusion Network
by: Cai, Jinjin, et al.
Published: (2024)

SoundBreak: A Systematic Study of Audio-Only Adversarial Attacks on Trimodal Models
by: Hussain, Aafiya, et al.
Published: (2026)

Guiding Audio Editing with Audio Language Model
by: Lan, Zitong, et al.
Published: (2025)

Open-Amp: Synthetic Data Framework for Audio Effect Foundation Models
by: Wright, Alec, et al.
Published: (2024)

Tuning In: Analysis of Audio Classifier Performance in Clinical Settings with Limited Data
by: Mahdi, Hamza, et al.
Published: (2024)

Unbiased Sliced Wasserstein Kernels for High-Quality Audio Captioning
by: Luong, Manh, et al.
Published: (2025)

IndieFake Dataset: A Benchmark Dataset for Audio Deepfake Detection
by: Kumar, Abhay, et al.
Published: (2025)

PoDAR: Power-Disentangled Audio Representation for Generative Modeling
by: Luebs, Alejandro, et al.
Published: (2026)

AudioGenX: Explainability on Text-to-Audio Generative Models
by: Kang, Hyunju, et al.
Published: (2025)

Efficient Parallel Audio Generation using Group Masked Language Modeling
by: Jeong, Myeonghun, et al.
Published: (2024)

Prompt Tuning for Audio Deepfake Detection: Computationally Efficient Test-time Domain Adaptation with Limited Target Dataset
by: Oiso, Hideyuki, et al.
Published: (2024)

Targeted Augmented Data for Audio Deepfake Detection
by: Astrid, Marcella, et al.
Published: (2024)

AVROBUSTBENCH: Benchmarking the Robustness of Audio-Visual Recognition Models at Test-Time
by: Maharana, Sarthak Kumar, et al.
Published: (2025)

JSQA: Speech Quality Assessment with Perceptually-Inspired Contrastive Pretraining Based on JND Audio Pairs
by: Fan, Junyi, et al.
Published: (2025)

Multi-Microphone Speech Emotion Recognition using the Hierarchical Token-semantic Audio Transformer Architecture
by: Cohen, Ohad, et al.
Published: (2024)

A Survey of Deep Learning Audio Generation Methods
by: Božić, Matej, et al.
Published: (2024)

Codecfake: An Initial Dataset for Detecting LLM-based Deepfake Audio
by: Lu, Yi, et al.
Published: (2024)

What Counts as Real? Speech Restoration and Voice Quality Conversion Pose New Challenges to Deepfake Detection
by: Satish, Shree Harsha Bokkahalli, et al.
Published: (2026)