:: Library Catalog

Bibliographic Details
Main Authors:	Siriwardena, Yashish M., Swedlow, Nathan, Howard, Audrey, Gitterman, Evan, Darcy, Dan, Espy-Wilson, Carol, Fanelli, Andrea
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2406.05947

Similar Items

Improving Speech Inversion Through Self-Supervised Embeddings and Enhanced Tract Variables
by: Attia, Ahmed Adel, et al.
Published: (2023)

Analyzing the Impact of Accent on English Speech: Acoustic and Articulatory Perspectives
by: Premananth, Gowtham, et al.
Published: (2025)

A multi-modal approach for identifying schizophrenia using cross-modal attention
by: Premananth, Gowtham, et al.
Published: (2023)

A Multimodal Framework for the Assessment of the Schizophrenia Spectrum
by: Premananth, Gowtham, et al.
Published: (2024)

Quantifying Articulatory Coordination as a Biomarker for Schizophrenia
by: Premananth, Gowtham, et al.
Published: (2025)

Acoustic to Articulatory Speech Inversion for Children with Velopharyngeal Insufficiency
by: Tabatabaee, Saba, et al.
Published: (2025)

Enhancing Acoustic-to-Articulatory Speech Inversion by Incorporating Nasality
by: Tabatabaee, Saba, et al.
Published: (2025)

Perceptual Ratings Predict Speech Inversion Articulatory Kinematics in Childhood Speech Sound Disorders
by: Benway, Nina R., et al.
Published: (2025)

Self-supervised Multimodal Speech Representations for the Assessment of Schizophrenia Symptoms
by: Premananth, Gowtham, et al.
Published: (2024)

Articulation-Informed ASR: Integrating Articulatory Features into ASR via Auxiliary Speech Inversion and Cross-Attention Fusion
by: Attia, Ahmed Adel, et al.
Published: (2025)

Speech-Based Estimation of Schizophrenia Severity Using Feature Fusion
by: Premananth, Gowtham, et al.
Published: (2024)

Towards noise-robust speech inversion through multi-task learning with speech enhancement
by: Tabatabaee, Saba, et al.
Published: (2026)

FT-Boosted SV: Towards Noise Robust Speaker Verification for English Speaking Classroom Environments
by: Tabatabaee, Saba, et al.
Published: (2025)

A Computational Approach to Analyzing Disrupted Language in Schizophrenia: Integrating Surprisal and Coherence Measures
by: Premananth, Gowtham, et al.
Published: (2025)

Reverse Attention for Lightweight Speech Enhancement on Edge Devices
by: Ojha, Shuubham, et al.
Published: (2025)

On the Relationship between Accent Strength and Articulatory Features
by: Huang, Kevin, et al.
Published: (2025)

RealClass: A Framework for Classroom Speech Simulation with Public Datasets and Game Engines
by: Attia, Ahmed Adel, et al.
Published: (2025)

Speech-Based Prioritization for Schizophrenia Intervention
by: Premananth, Gowtham, et al.
Published: (2025)

MacST: Multi-Accent Speech Synthesis via Text Transliteration for Accent Conversion
by: Inoue, Sho, et al.
Published: (2024)

Deep Speech Synthesis from Multimodal Articulatory Representations
by: Wu, Peter, et al.
Published: (2024)

From Weak Labels to Strong Results: Utilizing 5,000 Hours of Noisy Classroom Transcripts with Minimal Accurate Data
by: Attia, Ahmed Adel, et al.
Published: (2025)

Teaching Machines to Speak Using Articulatory Control
by: Anand, Akshay, et al.
Published: (2025)

Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults
by: Attia, Ahmed Adel, et al.
Published: (2023)

Scalable Controllable Accented TTS
by: Xinyuan, Henry Li, et al.
Published: (2025)

Towards a Quantitative Analysis of Coarticulation with a Phoneme-to-Articulatory Model
by: Fan, Chaofei, et al.
Published: (2024)

RT-VC: Real-Time Zero-Shot Voice Conversion with Speech Articulatory Coding
by: Liu, Yisi, et al.
Published: (2025)

Continued Pretraining for Domain Adaptation of Wav2vec2.0 in Automatic Speech Recognition for Elementary Math Classroom Settings
by: Attia, Ahmed Adel, et al.
Published: (2024)

Multimodal Biomarkers for Schizophrenia: Towards Individual Symptom Severity Estimation
by: Premananth, Gowtham, et al.
Published: (2025)

Acoustic-to-Articulatory Inversion of Clean Speech Using an MRI-Trained Model
by: Azzouz, Sofiane, et al.
Published: (2026)

SpeechAccentLLM: A Unified Framework for Foreign Accent Conversion and Text to Speech
by: Cheng, Zhuangfei, et al.
Published: (2025)

Multi-Scale Accent Modeling and Disentangling for Multi-Speaker Multi-Accent Text-to-Speech Synthesis
by: Zhou, Xuehao, et al.
Published: (2024)

Convert and Speak: Zero-shot Accent Conversion with Minimum Supervision
by: Jia, Zhijun, et al.
Published: (2024)

Streaming Non-Autoregressive Model for Accent Conversion and Pronunciation Improvement
by: Nguyen, Tuan-Nam, et al.
Published: (2025)

Pairwise Evaluation of Accent Similarity in Speech Synthesis
by: Zhong, Jinzuomu, et al.
Published: (2025)

Activation Steering for Accent Adaptation in Speech Foundation Models
by: Sun, Jinuo, et al.
Published: (2026)

CPT-Boosted Wav2vec2.0: Towards Noise Robust Speech Recognition for Classroom Environments
by: Attia, Ahmed Adel, et al.
Published: (2024)

Rethinking Discrete Speech Representation Tokens for Accent Generation
by: Zhong, Jinzuomu, et al.
Published: (2026)

Simulating Articulatory Trajectories with Phonological Feature Interpolation
by: Tandazo, Angelo Ortiz, et al.
Published: (2024)

Activation Steering for Accent-Neutralized Zero-Shot Text-To-Speech
by: Yang, Mu, et al.
Published: (2026)

Non-autoregressive real-time Accent Conversion model with voice cloning
by: Nechaev, Vladimir, et al.
Published: (2024)