:: Library Catalog

Bibliographic Details
Main Authors:	Xu, Pengyu; Li, Shijia; Sun, Ao; Zhang, Feng; Li, Yahan; Wu, Bo; Ma, Zhanyu; Li, Jiguo; Xu, Jun; Gao, Jiuchong; Hao, Jinghua; He, Renqing; Wang, Rui; Liu, Yang; Hu, Xiaobo; Yang, Fan; Zheng, Jia; Yao, Guanghua
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2510.21244

Similar Items

LTS-VoiceAgent: A Listen-Think-Speak Framework for Efficient Streaming Voice Interaction via Semantic Triggering and Incremental Reasoning
by: Zou, Wenhao, et al.
Published: (2026)

Long-term Task-oriented Agent: Proactive Long-term Intent Maintenance in Dynamic Environments
by: Shi, Qinglong, et al.
Published: (2026)

GeoRA: Geometry-Aware Low-Rank Adaptation for RLVR
by: Zhang, Jiaying, et al.
Published: (2026)

UserLM-R1: Modeling Human Reasoning in User Language Models with Multi-Reward Reinforcement Learning
by: Zhang, Feng, et al.
Published: (2026)

MUSE: Multi-Domain Chinese User Simulation via Self-Evolving Profiles and Rubric-Guided Alignment
by: Liu, Zihao, et al.
Published: (2026)

Silence the Judge: Reinforcement Learning with Self-Verifier via Latent Geometric Clustering
by: Zhang, Nonghai, et al.
Published: (2026)

Efficient Paths and Dense Rewards: Probabilistic Flow Reasoning for Large Language Models
by: Liu, Yan, et al.
Published: (2026)

Fine-Mem: Fine-Grained Feedback Alignment for Long-Horizon Memory Management
by: Ma, Weitao, et al.
Published: (2026)

State Rank Dynamics in Linear Attention LLMs
by: Sun, Ao, et al.
Published: (2026)

EchoVoices: Preserving Generational Voices and Memories for Seniors and Children
by: Xu, Haiying, et al.
Published: (2025)

From Reactive to Proactive: Assessing the Proactivity of Voice Agents via ProVoice-Bench
by: Xu, Ke, et al.
Published: (2026)

VoiceAgentRAG: Solving the RAG Latency Bottleneck in Real-Time Voice Agents Using Dual-Agent Architectures
by: Qiu, Jielin, et al.
Published: (2026)

VoiceAgentBench: Are Voice Assistants ready for agentic tasks?
by: Jain, Dhruv, et al.
Published: (2025)

WebNav: An Intelligent Agent for Voice-Controlled Web Navigation
by: Srinivasan, Trisanth, et al.
Published: (2025)

SpeechIQ: Speech-Agentic Intelligence Quotient Across Cognitive Levels in Voice Understanding by Large Language Models
by: Wan, Zhen, et al.
Published: (2025)

Hidden States Know Where Reasoning Diverges: Credit Assignment via Span-Level Wasserstein Distance
by: Chen, Xinzhu, et al.
Published: (2026)

Aegis: Towards Governance, Integrity, and Security of AI Voice Agents
by: Li, Xiang, et al.
Published: (2026)

From Simple to Professional: A Combinatorial Controllable Image Captioning Agent
by: Wang, Xinran, et al.
Published: (2024)

Voice-guided Orchestrated Intelligence for Clinical Evaluation (VOICE): A Voice AI Agent System for Prehospital Stroke Assessment
by: Acosta, Julian, et al.
Published: (2025)

ClonEval: An Open Voice Cloning Benchmark
by: Christop, Iwona, et al.
Published: (2025)

Back to Basics: Revisiting ASR in the Age of Voice Agents
by: Tay, Geeyang, et al.
Published: (2026)

i-LAVA: Insights on Low Latency Voice-2-Voice Architecture for Agents
by: Purwar, Anupam, et al.
Published: (2025)

$τ$-Voice: Benchmarking Full-Duplex Voice Agents on Real-World Domains
by: Ray, Soham, et al.
Published: (2026)

VoiceAssistant-Eval: Benchmarking AI Assistants across Listening, Speaking, and Viewing
by: Wang, Ke, et al.
Published: (2025)

MixTex: Unambiguous Recognition Should Not Rely Solely on Real Data
by: Luo, Renqing, et al.
Published: (2024)

AsyncVoice Agent: Real-Time Explanation for LLM Planning and Reasoning
by: Lin, Yueqian, et al.
Published: (2025)

AlphaEval: Evaluating Agents in Production
by: Lu, Pengrui, et al.
Published: (2026)

MedExAgent: Training LLM Agents to Ask, Examine, and Diagnose in Noisy Clinical Environments
by: Gao, Yicheng, et al.
Published: (2026)

X-Voice: Enabling Everyone to Speak 30 Languages via Zero-Shot Cross-Lingual Voice Cloning
by: Xu, Rixi, et al.
Published: (2026)

Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion
by: Sha, Binzhu, et al.
Published: (2023)

MOSS-VoiceGenerator: Create Realistic Voices with Natural Language Descriptions
by: Huang, Kexin, et al.
Published: (2026)

VoiceMark: Zero-Shot Voice Cloning-Resistant Watermarking Approach Leveraging Speaker-Specific Latents
by: Li, Haiyun, et al.
Published: (2025)

SkillMaster: Toward Autonomous Skill Mastery in LLM Agents
by: Yang, Min, et al.
Published: (2026)

EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation
by: Yang, Songlin, et al.
Published: (2026)

SYKI-SVC: Advancing Singing Voice Conversion with Post-Processing Innovations and an Open-Source Professional Testset
by: Zhou, Yiquan, et al.
Published: (2025)

Marco-Voice Technical Report
by: Tian, Fengping, et al.
Published: (2025)

Claw-Eval: Towards Trustworthy Evaluation of Autonomous Agents
by: Ye, Bowen, et al.
Published: (2026)

The Voice: Lessons on Trustworthy Conversational Agents from "Dune"
by: Feldman, Philip
Published: (2024)

VoiceSculptor: Your Voice, Designed By You
by: Hu, Jingbin, et al.
Published: (2026)

VoiceBench: Benchmarking LLM-Based Voice Assistants
by: Chen, Yiming, et al.
Published: (2024)