| Main Authors: | Xu, Pengyu, Li, Shijia, Sun, Ao, Zhang, Feng, Li, Yahan, Wu, Bo, Ma, Zhanyu, Li, Jiguo, Xu, Jun, Gao, Jiuchong, Hao, Jinghua, He, Renqing, Wang, Rui, Liu, Yang, Hu, Xiaobo, Yang, Fan, Zheng, Jia, Yao, Guanghua |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2510.21244 |
Similar Items
LTS-VoiceAgent: A Listen-Think-Speak Framework for Efficient Streaming Voice Interaction via Semantic Triggering and Incremental Reasoning
by: Zou, Wenhao, et al.
Published: (2026)
Long-term Task-oriented Agent: Proactive Long-term Intent Maintenance in Dynamic Environments
by: Shi, Qinglong, et al.
Published: (2026)
GeoRA: Geometry-Aware Low-Rank Adaptation for RLVR
by: Zhang, Jiaying, et al.
Published: (2026)
UserLM-R1: Modeling Human Reasoning in User Language Models with Multi-Reward Reinforcement Learning
by: Zhang, Feng, et al.
Published: (2026)
MUSE: Multi-Domain Chinese User Simulation via Self-Evolving Profiles and Rubric-Guided Alignment
by: Liu, Zihao, et al.
Published: (2026)
Silence the Judge: Reinforcement Learning with Self-Verifier via Latent Geometric Clustering
by: Zhang, Nonghai, et al.
Published: (2026)
Efficient Paths and Dense Rewards: Probabilistic Flow Reasoning for Large Language Models
by: Liu, Yan, et al.
Published: (2026)
Fine-Mem: Fine-Grained Feedback Alignment for Long-Horizon Memory Management
by: Ma, Weitao, et al.
Published: (2026)
State Rank Dynamics in Linear Attention LLMs
by: Sun, Ao, et al.
Published: (2026)
EchoVoices: Preserving Generational Voices and Memories for Seniors and Children
by: Xu, Haiying, et al.
Published: (2025)
From Reactive to Proactive: Assessing the Proactivity of Voice Agents via ProVoice-Bench
by: Xu, Ke, et al.
Published: (2026)
VoiceAgentRAG: Solving the RAG Latency Bottleneck in Real-Time Voice Agents Using Dual-Agent Architectures
by: Qiu, Jielin, et al.
Published: (2026)
VoiceAgentBench: Are Voice Assistants ready for agentic tasks?
by: Jain, Dhruv, et al.
Published: (2025)
WebNav: An Intelligent Agent for Voice-Controlled Web Navigation
by: Srinivasan, Trisanth, et al.
Published: (2025)
SpeechIQ: Speech-Agentic Intelligence Quotient Across Cognitive Levels in Voice Understanding by Large Language Models
by: Wan, Zhen, et al.
Published: (2025)
Hidden States Know Where Reasoning Diverges: Credit Assignment via Span-Level Wasserstein Distance
by: Chen, Xinzhu, et al.
Published: (2026)
Aegis: Towards Governance, Integrity, and Security of AI Voice Agents
by: Li, Xiang, et al.
Published: (2026)
From Simple to Professional: A Combinatorial Controllable Image Captioning Agent
by: Wang, Xinran, et al.
Published: (2024)
Voice-guided Orchestrated Intelligence for Clinical Evaluation (VOICE): A Voice AI Agent System for Prehospital Stroke Assessment
by: Acosta, Julian, et al.
Published: (2025)
ClonEval: An Open Voice Cloning Benchmark
by: Christop, Iwona, et al.
Published: (2025)
Back to Basics: Revisiting ASR in the Age of Voice Agents
by: Tay, Geeyang, et al.
Published: (2026)
i-LAVA: Insights on Low Latency Voice-2-Voice Architecture for Agents
by: Purwar, Anupam, et al.
Published: (2025)
$τ$-Voice: Benchmarking Full-Duplex Voice Agents on Real-World Domains
by: Ray, Soham, et al.
Published: (2026)
VoiceAssistant-Eval: Benchmarking AI Assistants across Listening, Speaking, and Viewing
by: Wang, Ke, et al.
Published: (2025)
MixTex: Unambiguous Recognition Should Not Rely Solely on Real Data
by: Luo, Renqing, et al.
Published: (2024)
AsyncVoice Agent: Real-Time Explanation for LLM Planning and Reasoning
by: Lin, Yueqian, et al.
Published: (2025)
AlphaEval: Evaluating Agents in Production
by: Lu, Pengrui, et al.
Published: (2026)
MedExAgent: Training LLM Agents to Ask, Examine, and Diagnose in Noisy Clinical Environments
by: Gao, Yicheng, et al.
Published: (2026)
X-Voice: Enabling Everyone to Speak 30 Languages via Zero-Shot Cross-Lingual Voice Cloning
by: Xu, Rixi, et al.
Published: (2026)
Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion
by: Sha, Binzhu, et al.
Published: (2023)
MOSS-VoiceGenerator: Create Realistic Voices with Natural Language Descriptions
by: Huang, Kexin, et al.
Published: (2026)
VoiceMark: Zero-Shot Voice Cloning-Resistant Watermarking Approach Leveraging Speaker-Specific Latents
by: Li, Haiyun, et al.
Published: (2025)
SkillMaster: Toward Autonomous Skill Mastery in LLM Agents
by: Yang, Min, et al.
Published: (2026)
EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation
by: Yang, Songlin, et al.
Published: (2026)
SYKI-SVC: Advancing Singing Voice Conversion with Post-Processing Innovations and an Open-Source Professional Testset
by: Zhou, Yiquan, et al.
Published: (2025)
Marco-Voice Technical Report
by: Tian, Fengping, et al.
Published: (2025)
Claw-Eval: Towards Trustworthy Evaluation of Autonomous Agents
by: Ye, Bowen, et al.
Published: (2026)
The Voice: Lessons on Trustworthy Conversational Agents from "Dune"
by: Feldman, Philip
Published: (2024)
VoiceSculptor: Your Voice, Designed By You
by: Hu, Jingbin, et al.
Published: (2026)
VoiceBench: Benchmarking LLM-Based Voice Assistants
by: Chen, Yiming, et al.
Published: (2024)