| Main Authors: | Yang, Jianing, Fujita, Yusuke, Sudo, Yui |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2603.09180 |
Similar Items
Synchronization and Turn-Taking in Full-Duplex Speech Dialogue Models
by: Riera, Pablo, et al.
Published: (2026)
Enabling Conversational Behavior Reasoning Capabilities in Full-Duplex Speech
by: Pan, Shuchang, et al.
Published: (2025)
Easy Turn: Integrating Acoustic and Linguistic Modalities for Robust Turn-Taking in Full-Duplex Spoken Dialogue Systems
by: Li, Guojian, et al.
Published: (2025)
Chronological Thinking in Full-Duplex Spoken Dialogue Language Models
by: Wu, Donghang, et al.
Published: (2025)
JAL-Turn: Joint Acoustic-Linguistic Modeling for Real-Time and Robust Turn-Taking Detection in Full-Duplex Spoken Dialogue Systems
by: Yang, Guangzhao, et al.
Published: (2026)
The Cascade Equivalence Hypothesis: When Do Speech LLMs Behave Like ASR$\rightarrow$LLM Pipelines?
by: Billa, Jayadev
Published: (2026)
Serialized Output Prompting for Large Language Model-based Multi-Talker Speech Recognition
by: Shi, Hao, et al.
Published: (2025)
MTR-DuplexBench: Towards a Comprehensive Evaluation of Multi-Round Conversations for Full-Duplex Speech Language Models
by: Zhang, He, et al.
Published: (2025)
Unit-Based Agent for Semi-Cascaded Full-Duplex Dialogue Systems
by: Yu, Haoyuan, et al.
Published: (2026)
ASPIRin: Action Space Projection for Interactivity-Optimized Reinforcement Learning in Full-Duplex Speech Language Models
by: Hsiao, Chi-Yuan, et al.
Published: (2026)
FD-Bench: A Full-Duplex Benchmarking Pipeline Designed for Full Duplex Spoken Dialogue Systems
by: Peng, Yizhou, et al.
Published: (2025)
Privacy-Preserving End-to-End Full-Duplex Speech Dialogue Models
by: Kuzmin, Nikita, et al.
Published: (2026)
PersonaKit (PK): A Plug-and-Play Platform for User Testing Diverse Roles in Full-Duplex Dialogue
by: Jeon, Hyunbae, et al.
Published: (2026)
Speech-Worthy Alignment for Japanese SpeechLLMs via Direct Preference Optimization
by: Zhao, Mengjie, et al.
Published: (2026)
OWSM-Biasing: Contextualizing Open Whisper-Style Speech Models for Automatic Speech Recognition with Dynamic Vocabulary
by: Sudo, Yui, et al.
Published: (2025)
From Signal to Turn: Interactional Friction in Modular Speech-to-Speech Pipelines
by: Mairittha, Tittaya, et al.
Published: (2025)
Blending LLMs into Cascaded Speech Translation: KIT's Offline Speech Translation System for IWSLT 2024
by: Koneru, Sai, et al.
Published: (2024)
FlexDuo: A Pluggable System for Enabling Full-Duplex Capabilities in Speech Dialogue Systems
by: Liao, Borui, et al.
Published: (2025)
AC/DC: LLM-based Audio Comprehension via Dialogue Continuation
by: Fujita, Yusuke, et al.
Published: (2025)
MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols
by: Du, Yuhao, et al.
Published: (2025)
Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition
by: Ma, Ziyang, et al.
Published: (2023)
LLM-Enhanced Dialogue Management for Full-Duplex Spoken Dialogue Systems
by: Zhang, Hao, et al.
Published: (2025)
FLM-Audio: Natural Monologues Improves Native Full-Duplex Chatbots via Dual Training
by: Yao, Yiqun, et al.
Published: (2025)
SALM-Duplex: Efficient and Direct Duplex Modeling for Speech-to-Speech Language Model
by: Hu, Ke, et al.
Published: (2025)
DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities
by: Lu, Xiangyu, et al.
Published: (2025)
EchoChain: A Full-Duplex Benchmark for State-Update Reasoning Under Interruptions
by: Modi, Smit Nautambhai, et al.
Published: (2026)
CascadeDebate: Multi-Agent Deliberation for Cost-Aware LLM Cascades
by: Chang, Raeyoung, et al.
Published: (2026)
Human-1 by Josh Talks: A Full-Duplex Conversational Modeling Framework in Hindi using Real-World Conversations
by: Singh, Bhaskar, et al.
Published: (2026)
Beyond Turn-Based Interfaces: Synchronous LLMs as Full-Duplex Dialogue Agents
by: Veluri, Bandhav, et al.
Published: (2024)
Streaming Translation and Transcription Through Speech-to-Text Causal Alignment
by: Koshkin, Roman, et al.
Published: (2026)
TurnGuide: Enhancing Meaningful Full Duplex Spoken Interactions via Dynamic Turn-Level Text-Speech Interleaving
by: Cui, Wenqian, et al.
Published: (2025)
UAF: A Unified Audio Front-end LLM for Full-Duplex Speech Interaction
by: Li, Yadong, et al.
Published: (2026)
Overcoming Latency Bottlenecks in On-Device Speech Translation: A Cascaded Approach with Alignment-Based Streaming MT
by: Ahmed, Zeeshan, et al.
Published: (2025)
PersonaPlex: Voice and Role Control for Full Duplex Conversational Speech Models
by: Roy, Rajarshi, et al.
Published: (2026)
MEDSAGE: Enhancing Robustness of Medical Dialogue Summarization to ASR Errors with LLM-generated Synthetic Dialogues
by: Binici, Kuluhan, et al.
Published: (2024)
When End-to-End is Overkill: Rethinking Cascaded Speech-to-Text Translation
by: Min, Anna, et al.
Published: (2025)
Vividh-ASR: A Complexity-Tiered Benchmark and Optimization Dynamics for Robust Indic Speech Recognition
by: Juvekar, Kush, et al.
Published: (2026)
Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities
by: Lin, Guan-Ting, et al.
Published: (2025)
LLM-Driven Multi-Turn Task-Oriented Dialogue Synthesis for Realistic Reasoning
by: Zhu, Yu, et al.
Published: (2026)
From Turn-Taking to Synchronous Dialogue: A Survey of Full-Duplex Spoken Language Models
by: Chen, Yuxuan, et al.
Published: (2025)