| Main Authors: | Rakib, Tazeek Bin Abdur, Mehrish, Ambuj, Soon, Lay-Ki, Lim, Wern Han, Poria, Soujanya |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.17795 |
Similar Items
PROEMO: Prompt-Driven Text-to-Speech Synthesis Based on Emotion and Intensity Control
by: Zhang, Shaozuo, et al.
Published: (2025)
Inference Time Alignment with Reward-Guided Tree Search
by: Hung, Chia-Yu, et al.
Published: (2024)
HyperTTS: Parameter Efficient Adaptation in Text to Speech using Hypernetworks
by: Li, Yingting, et al.
Published: (2024)
Leveraging Parameter-Efficient Transfer Learning for Multi-Lingual Text-to-Speech Adaptation
by: Li, Yingting, et al.
Published: (2024)
CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models
by: Li, Xiang, et al.
Published: (2024)
Improving Text-To-Audio Models with Synthetic Captions
by: Kong, Zhifeng, et al.
Published: (2024)
Accent Conversion in Text-To-Speech Using Multi-Level VAE and Adversarial Training
by: Melechovsky, Jan, et al.
Published: (2024)
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization
by: Hung, Chia-Yu, et al.
Published: (2024)
'Finance Wizard' at the FinLLM Challenge Task: Financial Text Summarization
by: Lee, Meisin, et al.
Published: (2024)
Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder
by: Melechovsky, Jan, et al.
Published: (2022)
DART: Disentanglement of Accent and Speaker Representation in Multispeaker Text-to-Speech
by: Melechovsky, Jan, et al.
Published: (2024)
SonicMaster: Towards Controllable All-in-One Music Restoration and Mastering
by: Melechovsky, Jan, et al.
Published: (2025)
Exact Flow Linear Attention: Exact Solution from Continuous-Time Dynamics
by: Lei, Jingdi, et al.
Published: (2025)
Towards Robust Instruction Tuning on Multimodal Large Language Models
by: Han, Wei, et al.
Published: (2024)
PREMISE: Matching-based Prediction for Accurate Review Recommendation
by: Han, Wei, et al.
Published: (2025)
Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model
by: Kang, Jaeyong, et al.
Published: (2023)
Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic
by: Bhardwaj, Rishabh, et al.
Published: (2024)
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling
by: Deep, Pala Tej, et al.
Published: (2024)
Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical Supervision
by: Pala, Tej Deep, et al.
Published: (2025)
Online Learning with Set-Valued Feedback
by: Raman, Vinod, et al.
Published: (2023)
Bayesian-Symbolic Integration for Uncertainty-Aware Parking Prediction
by: Nezhadettehad, Alireza, et al.
Published: (2026)
Stacked from One: Multi-Scale Self-Injection for Context Window Extension
by: Han, Wei, et al.
Published: (2026)
Ruby Teaming: Improving Quality Diversity Search with Memory for Automated Red Teaming
by: Han, Vernon Toh Yan, et al.
Published: (2024)
Sowing the Wind, Reaping the Whirlwind: The Impact of Editing Language Models
by: Hazra, Rima, et al.
Published: (2024)
Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations
by: Hazra, Rima, et al.
Published: (2024)
Not All Votes Count! Programs as Verifiers Improve Self-Consistency of Language Models for Math Reasoning
by: Toh, Vernon Y. H., et al.
Published: (2024)
Two are better than one: Context window extension with multi-grained self-injection
by: Han, Wei, et al.
Published: (2024)
Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost
by: Xuan, Richmond Sin Jing, et al.
Published: (2026)
Pixel-Level Reasoning Segmentation via Multi-turn Conversations
by: Cai, Dexian, et al.
Published: (2025)
Can-Do! A Dataset and Neuro-Symbolic Grounded Framework for Embodied Planning with Large Multimodal Models
by: Chia, Yew Ken, et al.
Published: (2024)
Self-Adaptive Sampling for Efficient Video Question-Answering on Image-Text Models
by: Han, Wei, et al.
Published: (2023)
Adaptive operator selection utilising generalised experience
by: Aydin, Mehmet Emin, et al.
Published: (2023)
Toward Robust Multimodal Learning using Multimodal Foundational Models
by: Zhao, Xianbing, et al.
Published: (2024)
OffTopicEval: When Large Language Models Enter the Wrong Chat, Almost Always!
by: Lei, Jingdi, et al.
Published: (2025)
How well ChatGPT understand Malaysian English? An Evaluation on Named Entity Recognition and Relation Extraction
by: Chanthran, Mohan Raj, et al.
Published: (2023)
Document-Level Zero-Shot Relation Extraction with Entity Side Information
by: Chanthran, Mohan Raj, et al.
Published: (2026)
Malaysian English News Decoded: A Linguistic Resource for Named Entity and Relation Extraction
by: Chanthran, Mohan Raj, et al.
Published: (2024)
Automating IRAC Analysis in Malaysian Contract Law using a Semi-Structured Knowledge Base
by: Kang, Xiaoxi, et al.
Published: (2024)
Bridging the Gap: Transfer Learning from English PLMs to Malaysian English
by: Chanthran, Mohan Raj, et al.
Published: (2024)
The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles
by: Toh, Vernon Y. H., et al.
Published: (2025)