Library Catalog

Bibliographic Details
Main Authors:	Chaparala, Kaavya; Thebaud, Thomas; López, Jesús Villalba; Moro-Velazquez, Laureano; Viechnicki, Peter; Dehak, Najim
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2605.17652

Similar Items

Speaker Verification with Speech-Aware LLMs: Evaluation and Augmentation
by: Thebaud, Thomas, et al.
Published: (2026)

Noise-robust Speech Separation with Fast Generative Correction
by: Wang, Helin, et al.
Published: (2024)

Paired by the Teacher: Turning Unpaired Data into High-Fidelity Pairs for Low-Resource Text Generation
by: Lu, Yen-Ju, et al.
Published: (2025)

Detecting Neurodegenerative Diseases using Frame-Level Handwriting Embeddings
by: Laouedj, Sarah, et al.
Published: (2025)

MaskVCT: Masked Voice Codec Transformer for Zero-Shot Voice Conversion With Increased Controllability via Multiple Guidances
by: Lee, Junhyeok, et al.
Published: (2025)

Cross-Corpus and Cross-domain Handwriting Assessment of NeuroDegenerative Diseases via Time-Series-to-Image Conversion
by: Chavez, Gabrielle, et al.
Published: (2025)

CA-SSLR: Condition-Aware Self-Supervised Learning Representation for Generalized Speech Processing
by: Lu, Yen-Ju, et al.
Published: (2024)

SoloSpeech: Enhancing Intelligibility and Quality in Target Speech Extraction through a Cascaded Generative Pipeline
by: Wang, Helin, et al.
Published: (2025)

Reconstruct! Don't Encode: Self-Supervised Representation Reconstruction Loss for High-Intelligibility and Low-Latency Streaming Neural Audio Codec
by: Lee, Junhyeok, et al.
Published: (2026)

Spoken DialogSum: An Emotion-Rich Conversational Dataset for Spoken Dialogue Summarization
by: Lu, Yen-Ju, et al.
Published: (2025)

Study of Pre-processing Defenses against Adversarial Attacks on State-of-the-art Speaker Recognition Systems
by: Joshi, Sonal, et al.
Published: (2021)

Dynamics of Handwriting for Cognitive Assessment
by: Chavez, Gabrielle, et al.
Published: (2024)

Analyzing Attention Focus in the Cookie Theft Picture Description Task Using Word Alignment
by: Favaro, Anna, et al.
Published: (2024)

Cognitive Assessment through Writing Tasks
by: Chen, Casey, et al.
Published: (2024)

Unraveling Adversarial Examples against Speaker Identification -- Techniques for Attack Detection and Victim Model Classification
by: Joshi, Sonal, et al.
Published: (2024)

DiT-Flow: Speech Enhancement Robust to Multiple Distortions based on Flow Matching in Latent Space and Diffusion Transformers
by: Cao, Tianyu, et al.
Published: (2026)

Enhancing Dialogue Annotation with Speaker Characteristics Leveraging a Frozen LLM
by: Thebaud, Thomas, et al.
Published: (2025)

Demographic Attributes Prediction from Speech Using WavLM Embeddings
by: Yang, Yuchen, et al.
Published: (2025)

Interpretable Features for the Assessment of Neurodegenerative Diseases through Handwriting Analysis
by: Thebaud, Thomas, et al.
Published: (2024)

Multimodal characterization of Alzheimer's Disease using speech, eye movement, and handwriting
by: Moro-Velazquez, Laureano, et al.
Published: (2024)

Where Do Backdoors Live? A Component-Level Analysis of Backdoor Propagation in Speech Language Models
by: Fortier, Alexandrine, et al.
Published: (2025)

Time Scale Network: A Shallow Neural Network For Time Series Data
by: Meyer, Trevor, et al.
Published: (2023)

CapSpeech: Enabling Downstream Applications in Style-Captioned Text-to-Speech
by: Wang, Helin, et al.
Published: (2025)

Multi-Target Backdoor Attacks Against Speaker Recognition
by: Fortier, Alexandrine, et al.
Published: (2025)

ReFESS-QI: Reference-Free Evaluation For Speech Separation With Joint Quality And Intelligibility Scoring
by: Frummer, Ari, et al.
Published: (2025)

Vox-Profile: A Speech Foundation Model Benchmark for Characterizing Diverse Speaker and Speech Traits
by: Feng, Tiantian, et al.
Published: (2025)

Multimodal Analysis of Behavior During Stroop Test for Characterization of Alzheimer’s Disease Signs
by: Meyer, Trevor, et al.
Published: (2024)

Clean Label Attacks against SLU Systems
by: Xinyuan, Henry Li, et al.
Published: (2024)

Adversarial Attacks and Defenses for Speech Recognition Systems
by: Żelasko, Piotr, et al.
Published: (2021)

Mai Ho'omāuna i ka 'Ai: Language Models Improve Automatic Speech Recognition in Hawaiian
by: Chaparala, Kaavya, et al.
Published: (2024)

SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis
by: Wang, Helin, et al.
Published: (2024)

GENFIG1: Visual Summaries of Scholarly Work as a Challenge for Vision-Language Models
by: Guan, Yaohan, et al.
Published: (2026)

SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer
by: Wang, Helin, et al.
Published: (2024)

Language model integration based on memory control for sequence to sequence speech recognition
by: Cho, Jaejin, et al.
Published: (2018)

Latent Speech-Text Transformer
by: Lu, Yen-Ju, et al.
Published: (2025)

Layer-Aware Early Fusion of Acoustic and Linguistic Embeddings for Cognitive Status Classification
by: Novotny, Krystof, et al.
Published: (2026)

Measurement of the Granularity of Vowel Production Space By Just Producible Different (JPD) Limens
by: Viechnicki, Peter
Published: (2025)

SAM Audio Judge: A Unified Multimodal Framework for Perceptual Evaluation of Audio Separation
by: Wang, Helin, et al.
Published: (2026)

Automatic Proficiency Assessment in L2 English Learners
by: Mohammadi, Armita, et al.
Published: (2025)