:: Library Catalog

Bibliographic Details
Main Authors:	Li, Jialun; Jiang, Weitao; Cui, Ziyun; Duan, Yinan; Qu, Diyang; Zhang, Chao; Chen, Runsen; Lei, Chang; Wu, Wen
Format:	Preprint
Published:	2025
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2509.22153
Similar Items

Speaker Anonymisation for Speech-based Suicide Risk Detection
by: Cui, Ziyun, et al.
Published: (2025)

Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models
by: Cui, Ziyun, et al.
Published: (2024)

The 1st SpeechWellness Challenge: Detecting Suicide Risk Among Adolescents
by: Wu, Wen, et al.
Published: (2025)

SPEAR: A Unified SSL Framework for Learning Speech and Audio Representations
by: Yang, Xiaoyu, et al.
Published: (2025)

Leveraging Large Language Models for Spontaneous Speech-Based Suicide Risk Detection
by: Gao, Yifan, et al.
Published: (2025)

Transferring speech-generic and depression-specific knowledge for Alzheimer's disease detection
by: Cui, Ziyun, et al.
Published: (2023)

FlexSpeech: Towards Stable, Controllable and Expressive Text-to-Speech
by: Ma, Linhan, et al.
Published: (2025)

Efficient Speech Watermarking for Speech Synthesis via Progressive Knowledge Distillation
by: Cui, Yang, et al.
Published: (2025)

Rare Word Recognition and Translation Without Fine-Tuning via Task Vector in Speech Models
by: Jing, Ruihao, et al.
Published: (2025)

UniFlow: Unifying Speech Front-End Tasks via Continuous Generative Modeling
by: Wang, Ziqian, et al.
Published: (2025)

Exploring Cross-Utterance Speech Contexts for Conformer-Transducer Speech Recognition Systems
by: Cui, Mingyu, et al.
Published: (2025)

Non-Invasive Suicide Risk Prediction Through Speech Analysis
by: Amiriparian, Shahin, et al.
Published: (2024)

SpeechLLM-as-Judges: Towards General and Interpretable Speech Quality Evaluation
by: Wang, Hui, et al.
Published: (2025)

Towards Robust Dysarthric Speech Recognition: LLM-Agent Post-ASR Correction Beyond WER
by: Zheng, Xiuwen, et al.
Published: (2026)

Acoustic and Machine Learning Methods for Speech-Based Suicide Risk Assessment: A Systematic Review
by: Marie, Ambre, et al.
Published: (2025)

Towards Probing Speech-Specific Risks in Large Multimodal Models: A Taxonomy, Benchmark, and Insights
by: Yang, Hao, et al.
Published: (2024)

Quantifying Cross-Lingual Transfer in Paralinguistic Speech Tasks
by: Buitrago, Pol, et al.
Published: (2026)

Towards Expressive Zero-Shot Speech Synthesis with Hierarchical Prosody Modeling
by: Jiang, Yuepeng, et al.
Published: (2024)

Towards Explainable Spoofed Speech Attribution and Detection: a Probabilistic Approach for Characterizing Speech Synthesizer Components
by: Mishra, Jagabandhu, et al.
Published: (2025)

S2ST-Omni: Hierarchical Language-Aware SpeechLLM Adaptation for Multilingual Speech-to-Speech Translation
by: Pan, Yu, et al.
Published: (2025)

Towards Data Drift Monitoring for Speech Deepfake Detection in the context of MLOps
by: Wang, Xin, et al.
Published: (2025)

TraceableSpeech: Towards Proactively Traceable Text-to-Speech with Watermarking
by: Zhou, Junzuo, et al.
Published: (2024)

Exploring In-Context Learning of Textless Speech Language Model for Speech Classification Tasks
by: Hsu, Ming-Hao, et al.
Published: (2023)

Plugin Speech Enhancement: A Universal Speech Enhancement Framework Inspired by Dynamic Neural Network
by: Chen, Yanan, et al.
Published: (2024)

Distributed Asynchronous Device Speech Enhancement via Windowed Cross-Attention
by: Yang, Gene-Ping, et al.
Published: (2025)

Towards Attribution of Generators and Emotional Manipulation in Cross-Lingual Synthetic Speech using Geometric Learning
by: Girish, et al.
Published: (2025)

Comparison of Speech Tasks in Human Expert and Machine Detection of Parkinson's Disease
by: Plantinga, Peter, et al.
Published: (2025)

UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021
by: Chen, Xinhui, et al.
Published: (2021)

A Neural Speech Codec for Noise Robust Speech Coding
by: Huang, Jiayi, et al.
Published: (2023)

UrduSpeech: A 156-Hour Urdu Speech Corpus with 12-Dimension Paralinguistic Annotations
by: Haq, Attia Nafees ul, et al.
Published: (2026)

Towards Ultra-Low-Power Neuromorphic Speech Enhancement with Spiking-FullSubNet
by: Hao, Xiang, et al.
Published: (2024)

dLLM-ASR: A Faster Diffusion LLM-based Framework for Speech Recognition
by: Tian, Wenjie, et al.
Published: (2026)

Cross-Utterance Conditioned VAE for Speech Generation
by: Li, Yang, et al.
Published: (2023)

Speech Token Prediction via Compressed-to-fine Language Modeling for Speech Generation
by: Liu, Wenrui, et al.
Published: (2025)

SALSA: Speech Aware LLM Adaptation via Learned Steering Activation Vectors
by: Yegorova, Yekaterina, et al.
Published: (2026)

BiCrossMamba-ST: Speech Deepfake Detection with Bidirectional Mamba Spectro-Temporal Cross-Attention
by: Kheir, Yassine El, et al.
Published: (2025)

Validating Computational Markers of Depressive Behavior: Cross-Linguistic Speech-Based Depression Detection with Neurophysiological Validation
by: Tao, Fuxiang, et al.
Published: (2026)

Enhancing Non-Core Language Instruction-Following in Speech LLMs via Semi-Implicit Cross-Lingual CoT Reasoning
by: Xue, Hongfei, et al.
Published: (2025)

TokenVerse: Towards Unifying Speech and NLP Tasks via Transducer-based ASR
by: Kumar, Shashi, et al.
Published: (2024)

From Sharpness to Better Generalization for Speech Deepfake Detection
by: Huang, Wen, et al.
Published: (2025)