:: Library Catalog

Bibliographic Details
Main Authors:	Yang, Jianing; Fujita, Yusuke; Sudo, Yui
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2603.09180

Similar Items

Synchronization and Turn-Taking in Full-Duplex Speech Dialogue Models
by: Riera, Pablo, et al.
Published: (2026)

Enabling Conversational Behavior Reasoning Capabilities in Full-Duplex Speech
by: Pan, Shuchang, et al.
Published: (2025)

Easy Turn: Integrating Acoustic and Linguistic Modalities for Robust Turn-Taking in Full-Duplex Spoken Dialogue Systems
by: Li, Guojian, et al.
Published: (2025)

Chronological Thinking in Full-Duplex Spoken Dialogue Language Models
by: Wu, Donghang, et al.
Published: (2025)

JAL-Turn: Joint Acoustic-Linguistic Modeling for Real-Time and Robust Turn-Taking Detection in Full-Duplex Spoken Dialogue Systems
by: Yang, Guangzhao, et al.
Published: (2026)

The Cascade Equivalence Hypothesis: When Do Speech LLMs Behave Like ASR→LLM Pipelines?
by: Billa, Jayadev
Published: (2026)

Serialized Output Prompting for Large Language Model-based Multi-Talker Speech Recognition
by: Shi, Hao, et al.
Published: (2025)

MTR-DuplexBench: Towards a Comprehensive Evaluation of Multi-Round Conversations for Full-Duplex Speech Language Models
by: Zhang, He, et al.
Published: (2025)

Unit-Based Agent for Semi-Cascaded Full-Duplex Dialogue Systems
by: Yu, Haoyuan, et al.
Published: (2026)

ASPIRin: Action Space Projection for Interactivity-Optimized Reinforcement Learning in Full-Duplex Speech Language Models
by: Hsiao, Chi-Yuan, et al.
Published: (2026)

FD-Bench: A Full-Duplex Benchmarking Pipeline Designed for Full Duplex Spoken Dialogue Systems
by: Peng, Yizhou, et al.
Published: (2025)

Privacy-Preserving End-to-End Full-Duplex Speech Dialogue Models
by: Kuzmin, Nikita, et al.
Published: (2026)

PersonaKit (PK): A Plug-and-Play Platform for User Testing Diverse Roles in Full-Duplex Dialogue
by: Jeon, Hyunbae, et al.
Published: (2026)

Speech-Worthy Alignment for Japanese SpeechLLMs via Direct Preference Optimization
by: Zhao, Mengjie, et al.
Published: (2026)

OWSM-Biasing: Contextualizing Open Whisper-Style Speech Models for Automatic Speech Recognition with Dynamic Vocabulary
by: Sudo, Yui, et al.
Published: (2025)

From Signal to Turn: Interactional Friction in Modular Speech-to-Speech Pipelines
by: Mairittha, Tittaya, et al.
Published: (2025)

Blending LLMs into Cascaded Speech Translation: KIT's Offline Speech Translation System for IWSLT 2024
by: Koneru, Sai, et al.
Published: (2024)

FlexDuo: A Pluggable System for Enabling Full-Duplex Capabilities in Speech Dialogue Systems
by: Liao, Borui, et al.
Published: (2025)

AC/DC: LLM-based Audio Comprehension via Dialogue Continuation
by: Fujita, Yusuke, et al.
Published: (2025)

MTalk-Bench: Evaluating Speech-to-Speech Models in Multi-Turn Dialogues via Arena-style and Rubrics Protocols
by: Du, Yuhao, et al.
Published: (2025)

Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition
by: Ma, Ziyang, et al.
Published: (2023)

LLM-Enhanced Dialogue Management for Full-Duplex Spoken Dialogue Systems
by: Zhang, Hao, et al.
Published: (2025)

FLM-Audio: Natural Monologues Improves Native Full-Duplex Chatbots via Dual Training
by: Yao, Yiqun, et al.
Published: (2025)

SALM-Duplex: Efficient and Direct Duplex Modeling for Speech-to-Speech Language Model
by: Hu, Ke, et al.
Published: (2025)

DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities
by: Lu, Xiangyu, et al.
Published: (2025)

EchoChain: A Full-Duplex Benchmark for State-Update Reasoning Under Interruptions
by: Modi, Smit Nautambhai, et al.
Published: (2026)

CascadeDebate: Multi-Agent Deliberation for Cost-Aware LLM Cascades
by: Chang, Raeyoung, et al.
Published: (2026)

Human-1 by Josh Talks: A Full-Duplex Conversational Modeling Framework in Hindi using Real-World Conversations
by: Singh, Bhaskar, et al.
Published: (2026)

Beyond Turn-Based Interfaces: Synchronous LLMs as Full-Duplex Dialogue Agents
by: Veluri, Bandhav, et al.
Published: (2024)

Streaming Translation and Transcription Through Speech-to-Text Causal Alignment
by: Koshkin, Roman, et al.
Published: (2026)

TurnGuide: Enhancing Meaningful Full Duplex Spoken Interactions via Dynamic Turn-Level Text-Speech Interleaving
by: Cui, Wenqian, et al.
Published: (2025)

UAF: A Unified Audio Front-end LLM for Full-Duplex Speech Interaction
by: Li, Yadong, et al.
Published: (2026)

Overcoming Latency Bottlenecks in On-Device Speech Translation: A Cascaded Approach with Alignment-Based Streaming MT
by: Ahmed, Zeeshan, et al.
Published: (2025)

PersonaPlex: Voice and Role Control for Full Duplex Conversational Speech Models
by: Roy, Rajarshi, et al.
Published: (2026)

MEDSAGE: Enhancing Robustness of Medical Dialogue Summarization to ASR Errors with LLM-generated Synthetic Dialogues
by: Binici, Kuluhan, et al.
Published: (2024)

When End-to-End is Overkill: Rethinking Cascaded Speech-to-Text Translation
by: Min, Anna, et al.
Published: (2025)

Vividh-ASR: A Complexity-Tiered Benchmark and Optimization Dynamics for Robust Indic Speech Recognition
by: Juvekar, Kush, et al.
Published: (2026)

Full-Duplex-Bench: A Benchmark to Evaluate Full-duplex Spoken Dialogue Models on Turn-taking Capabilities
by: Lin, Guan-Ting, et al.
Published: (2025)

LLM-Driven Multi-Turn Task-Oriented Dialogue Synthesis for Realistic Reasoning
by: Zhu, Yu, et al.
Published: (2026)

From Turn-Taking to Synchronous Dialogue: A Survey of Full-Duplex Spoken Language Models
by: Chen, Yuxuan, et al.
Published: (2025)