Library Catalog

Bibliographic Details
Main Authors:	Zhu, Lei; Wang, Xiaobao; Yang, Jianbiao; Wang, Chenyang; He, Dongxiao; Wang, Longbiao; Dang, Jianwu
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2605.02277

Similar Items

AIMDiT: Modality Augmentation and Interaction via Multimodal Dimension Transformation for Emotion Recognition in Conversations
by: Wu, Sheng, et al.
Published: (2024)

Enriching Multimodal Sentiment Analysis through Textual Emotional Descriptions of Visual-Audio Content
by: Wu, Sheng, et al.
Published: (2024)

Integration of Old and New Knowledge for Generalized Intent Discovery: A Consistency-driven Prototype-Prompting Framework
by: Wei, Xiao, et al.
Published: (2025)

Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition
by: Shu, Yuchun, et al.
Published: (2024)

Breaking Data Efficiency Dilemma: A Federated and Augmented Learning Framework For Alzheimer's Disease Detection via Speech
by: Wei, Xiao, et al.
Published: (2026)

ASDA: Audio Spectrogram Differential Attention Mechanism for Self-Supervised Representation Learning
by: Wang, Junyu, et al.
Published: (2025)

Rethinking Contrastive Learning in Graph Anomaly Detection: A Clean-View Perspective
by: Jin, Di, et al.
Published: (2025)

A Dynamic Knowledge Update-Driven Model with Large Language Models for Fake News Detection
by: Jin, Di, et al.
Published: (2025)

Expressive Prompting: Improving Emotion Intensity and Speaker Consistency in Zero-Shot TTS
by: Wang, Haoyu, et al.
Published: (2024)

Pay More Attention To Audio: Mitigating Imbalance of Cross-Modal Attention in Large Audio Language Models
by: Wang, Junyu, et al.
Published: (2025)

Scrambled text: training Language Models to correct OCR errors using synthetic data
by: Bourne, Jonathan
Published: (2024)

POTSA: A Cross-Lingual Speech Alignment Framework for Speech-to-Text Translation
by: Li, Xuanchen, et al.
Published: (2025)

MSR-HuBERT: Self-supervised Pre-training for Adaptation to Multiple Sampling Rates
by: Huang, Zikang, et al.
Published: (2026)

InstructAudio: Unified speech and music generation with natural language instruction
by: Qiang, Chunyu, et al.
Published: (2025)

An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios
by: Gong, Cheng, et al.
Published: (2024)

SecoustiCodec: Cross-Modal Aligned Streaming Single-Codecbook Speech Codec
by: Qiang, Chunyu, et al.
Published: (2025)

Measuring short-form factuality in large language models
by: Wei, Jason, et al.
Published: (2024)

Long-form factuality in large language models
by: Wei, Jerry, et al.
Published: (2024)

UniSonate: A Unified Model for Speech, Music, and Sound Effect Generation with Text Instructions
by: Qiang, Chunyu, et al.
Published: (2026)

VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing
by: Qiang, Chunyu, et al.
Published: (2024)

VERISCORE: Evaluating the factuality of verifiable claims in long-form text generation
by: Song, Yixiao, et al.
Published: (2024)

VeriFastScore: Speeding up long-form factuality evaluation
by: Rajendhran, Rishanth, et al.
Published: (2025)

Collaborative decoding of critical tokens for boosting factuality of large language models
by: Jin, Lifeng, et al.
Published: (2024)

MLRIP: Pre-training a military language representation model with informative factual knowledge and professional knowledge base
by: Li, Hui, et al.
Published: (2022)

Sailing by the Stars: A Survey on Reward Models and Learning Strategies for Learning from Rewards
by: Wu, Xiaobao
Published: (2025)

RE-Searcher: Robust Agentic Search with Goal-oriented Planning and Self-reflection
by: Fu, Daocheng, et al.
Published: (2025)

KG-BiLM: Knowledge Graph Embedding via Bidirectional Language Models
by: Chen, Zirui, et al.
Published: (2025)

LORT: Locally Refined Convolution and Taylor Transformer for Monaural Speech Enhancement
by: Wang, Junyu, et al.
Published: (2025)

Mamba-SEUNet: Mamba UNet for Monaural Speech Enhancement
by: Wang, Junyu, et al.
Published: (2024)

AKEW: Assessing Knowledge Editing in the Wild
by: Wu, Xiaobao, et al.
Published: (2024)

Large Language Models, scientific knowledge and factuality: A framework to streamline human expert evaluation
by: Wysocka, Magdalena, et al.
Published: (2023)

Enhancing Goal-oriented Proactive Dialogue Systems via Consistency Reflection and Correction
by: Zhang, Didi, et al.
Published: (2025)

High-precision medical speech recognition through synthetic data and semantic correction: UNITED-MEDASR
by: Banerjee, Sourav, et al.
Published: (2024)

Tag and correct: high precision post-editing approach to correction of speech recognition errors
by: Ziętkiewicz, Tomasz
Published: (2024)

A safety realignment framework via subspace-oriented model fusion for large language models
by: Yi, Xin, et al.
Published: (2024)

Chain-of-Though (CoT) prompting strategies for medical error detection and correction
by: Wu, Zhaolong, et al.
Published: (2024)

Progressive Residual Extraction based Pre-training for Speech Representation Learning
by: Wang, Tianrui, et al.
Published: (2024)

LLMs cannot find reasoning errors, but can correct them given the error location
by: Tyen, Gladys, et al.
Published: (2023)

BootTOD: Bootstrap Task-oriented Dialogue Representations by Aligning Diverse Responses
by: Zeng, Weihao, et al.
Published: (2024)

Generate Then Correct: Single Shot Global Correction for Aspect Sentiment Quad Prediction
by: He, Shidong, et al.
Published: (2026)