| Main Authors: | Chaparala, Kaavya, Thebaud, Thomas, López, Jesús Villalba, Moro-Velazquez, Laureano, Viechnicki, Peter, Dehak, Najim |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.17652 |
Similar Items
Speaker Verification with Speech-Aware LLMs: Evaluation and Augmentation
by: Thebaud, Thomas, et al.
Published: (2026)
Noise-robust Speech Separation with Fast Generative Correction
by: Wang, Helin, et al.
Published: (2024)
Paired by the Teacher: Turning Unpaired Data into High-Fidelity Pairs for Low-Resource Text Generation
by: Lu, Yen-Ju, et al.
Published: (2025)
Detecting Neurodegenerative Diseases using Frame-Level Handwriting Embeddings
by: Laouedj, Sarah, et al.
Published: (2025)
MaskVCT: Masked Voice Codec Transformer for Zero-Shot Voice Conversion With Increased Controllability via Multiple Guidances
by: Lee, Junhyeok, et al.
Published: (2025)
Cross-Corpus and Cross-domain Handwriting Assessment of NeuroDegenerative Diseases via Time-Series-to-Image Conversion
by: Chavez, Gabrielle, et al.
Published: (2025)
CA-SSLR: Condition-Aware Self-Supervised Learning Representation for Generalized Speech Processing
by: Lu, Yen-Ju, et al.
Published: (2024)
SoloSpeech: Enhancing Intelligibility and Quality in Target Speech Extraction through a Cascaded Generative Pipeline
by: Wang, Helin, et al.
Published: (2025)
Reconstruct! Don't Encode: Self-Supervised Representation Reconstruction Loss for High-Intelligibility and Low-Latency Streaming Neural Audio Codec
by: Lee, Junhyeok, et al.
Published: (2026)
Spoken DialogSum: An Emotion-Rich Conversational Dataset for Spoken Dialogue Summarization
by: Lu, Yen-Ju, et al.
Published: (2025)
Study of Pre-processing Defenses against Adversarial Attacks on State-of-the-art Speaker Recognition Systems
by: Joshi, Sonal, et al.
Published: (2021)
Dynamics of Handwriting for Cognitive Assessment
by: Gabrielle Chavez, et al.
Published: (2024)
Analyzing Attention Focus in the Cookie Theft Picture Description Task Using Word Alignment
by: Anna Favaro, et al.
Published: (2024)
Cognitive Assessment through Writing Tasks
by: Casey Chen, et al.
Published: (2024)
Unraveling Adversarial Examples against Speaker Identification -- Techniques for Attack Detection and Victim Model Classification
by: Joshi, Sonal, et al.
Published: (2024)
DiT-Flow: Speech Enhancement Robust to Multiple Distortions based on Flow Matching in Latent Space and Diffusion Transformers
by: Cao, Tianyu, et al.
Published: (2026)
Enhancing Dialogue Annotation with Speaker Characteristics Leveraging a Frozen LLM
by: Thebaud, Thomas, et al.
Published: (2025)
Demographic Attributes Prediction from Speech Using WavLM Embeddings
by: Yang, Yuchen, et al.
Published: (2025)
Interpretable Features for the Assessment of Neurodegenerative Diseases through Handwriting Analysis
by: Thebaud, Thomas, et al.
Published: (2024)
Multimodal characterization of Alzheimer's Disease using speech, eye movement, and handwriting
by: Laureano Moro‐Velazquez, et al.
Published: (2024)
Multimodal characterization of Alzheimer’s Disease using speech, eye movement, and handwriting
by: Laureano Moro‐Velazquez, et al.
Published: (2024)
Where Do Backdoors Live? A Component-Level Analysis of Backdoor Propagation in Speech Language Models
by: Fortier, Alexandrine, et al.
Published: (2025)
Time Scale Network: A Shallow Neural Network For Time Series Data
by: Meyer, Trevor, et al.
Published: (2023)
CapSpeech: Enabling Downstream Applications in Style-Captioned Text-to-Speech
by: Wang, Helin, et al.
Published: (2025)
Multi-Target Backdoor Attacks Against Speaker Recognition
by: Fortier, Alexandrine, et al.
Published: (2025)
ReFESS-QI: Reference-Free Evaluation For Speech Separation With Joint Quality And Intelligibility Scoring
by: Frummer, Ari, et al.
Published: (2025)
Vox-Profile: A Speech Foundation Model Benchmark for Characterizing Diverse Speaker and Speech Traits
by: Feng, Tiantian, et al.
Published: (2025)
Multimodal Analysis of Behavior During Stroop Test for Characterization of Alzheimer’s Disease Signs
by: Trevor Meyer, et al.
Published: (2024)
Clean Label Attacks against SLU Systems
by: Xinyuan, Henry Li, et al.
Published: (2024)
Adversarial Attacks and Defenses for Speech Recognition Systems
by: Żelasko, Piotr, et al.
Published: (2021)
Mai Ho'omāuna i ka 'Ai: Language Models Improve Automatic Speech Recognition in Hawaiian
by: Chaparala, Kaavya, et al.
Published: (2024)
SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis
by: Wang, Helin, et al.
Published: (2024)
GENFIG1: Visual Summaries of Scholarly Work as a Challenge for Vision-Language Models
by: Guan, Yaohan, et al.
Published: (2026)
SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer
by: Wang, Helin, et al.
Published: (2024)
Language model integration based on memory control for sequence to sequence speech recognition
by: Cho, Jaejin, et al.
Published: (2018)
Latent Speech-Text Transformer
by: Lu, Yen-Ju, et al.
Published: (2025)
Layer-Aware Early Fusion of Acoustic and Linguistic Embeddings for Cognitive Status Classification
by: Novotny, Krystof, et al.
Published: (2026)
Measurement of the Granularity of Vowel Production Space By Just Producible Different (JPD) Limens
by: Viechnicki, Peter
Published: (2025)
SAM Audio Judge: A Unified Multimodal Framework for Perceptual Evaluation of Audio Separation
by: Wang, Helin, et al.
Published: (2026)
Automatic Proficiency Assessment in L2 English Learners
by: Mohammadi, Armita, et al.
Published: (2025)