| Main Authors: | Song, Maojia, Pala, Tej Deep, Zhou, Ruiwen, Jin, Weisheng, Zadeh, Amir, Li, Chuan, Herremans, Dorien, Poria, Soujanya |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2508.18321 |
Similar Items
PromptDistill: Query-based Selective Token Retention in Intermediate Layers for Efficient Large Language Model Inference
by: Jin, Weisheng, et al.
Published: (2025)
Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical Supervision
by: Pala, Tej Deep, et al.
Published: (2025)
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling
by: Deep, Pala Tej, et al.
Published: (2024)
Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model
by: Kang, Jaeyong, et al.
Published: (2023)
Lessons from Training Grounded LLMs with Verifiable Rewards
by: Sim, Shang Hong, et al.
Published: (2025)
Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique
by: Pala, Tej Deep, et al.
Published: (2024)
Training Vision-Language Process Reward Models for Test-Time Scaling in Multimodal Reasoning: Key Insights and Lessons Learned
by: Ong, Brandon, et al.
Published: (2025)
Demystifying deep search: a holistic evaluation with hint-free multi-hop questions and factorised metrics
by: Song, Maojia, et al.
Published: (2025)
NORA-1.5: A Vision-Language-Action Model Trained using World Model- and Action-based Preference Rewards
by: Hung, Chia-Yu, et al.
Published: (2025)
Mustango: Toward Controllable Text-to-Music Generation
by: Melechovsky, Jan, et al.
Published: (2023)
JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment
by: Liu, Renhang, et al.
Published: (2025)
Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning
by: Sun, Qi, et al.
Published: (2024)
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse
by: Song, Maojia, et al.
Published: (2024)
OffTopicEval: When Large Language Models Enter the Wrong Chat, Almost Always!
by: Lei, Jingdi, et al.
Published: (2025)
Epistemic Context Learning: Building Trust the Right Way in LLM-Based Multi-Agent Systems
by: Zhou, Ruiwen, et al.
Published: (2026)
Are We There Yet? A Brief Survey of Music Emotion Prediction Datasets, Models and Outstanding Challenges
by: Kang, Jaeyong, et al.
Published: (2024)
DisfluencySpeech -- Single-Speaker Conversational Speech Dataset with Paralanguage
by: Wang, Kyra, et al.
Published: (2024)
PreBit -- A multimodal model with Twitter FinBERT embeddings for extreme price movement prediction of Bitcoin
by: Zou, Yanzhao, et al.
Published: (2022)
Towards Unified Music Emotion Recognition across Dimensional and Categorical Models
by: Kang, Jaeyong, et al.
Published: (2025)
Aligning Generative Music AI with Human Preferences: Methods and Challenges
by: Herremans, Dorien, et al.
Published: (2025)
DeepUnifiedMom: Unified Time-series Momentum Portfolio Construction via Multi-Task Learning with Multi-Gate Mixture of Experts
by: Ong, Joel, et al.
Published: (2024)
APEX: Large-scale Multi-task Aesthetic-Informed Popularity Prediction for AI-Generated Music
by: Husain, Jaavid Aktar, et al.
Published: (2026)
Forecasting Bitcoin volatility spikes from whale transactions and CryptoQuant data using Synthesizer Transformer models
by: Herremans, Dorien, et al.
Published: (2022)
NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks
by: Hung, Chia-Yu, et al.
Published: (2025)
When Drawing Is Not Enough: Exploring Spontaneous Speech with Sketch for Intent Alignment in Multimodal LLMs
by: Shi, Weiyan, et al.
Published: (2026)
KARMA-MV: A Benchmark for Causal Question Answering on Music Videos
by: Ghosh, Archishman, et al.
Published: (2026)
BandCondiNet: Parallel Transformers-based Conditional Popular Music Generation with Multi-View Features
by: Luo, Jing, et al.
Published: (2024)
Leveraging LLM Embeddings for Cross Dataset Label Alignment and Zero Shot Music Emotion Prediction
by: Liu, Renhang, et al.
Published: (2024)
Digital Lifelong Learning in the Age of AI: Trends and Insights
by: Puri, Geeta, et al.
Published: (2026)
MidiCaps: A large-scale MIDI dataset with text captions
by: Melechovsky, Jan, et al.
Published: (2024)
MIRFLEX: Music Information Retrieval Feature Library for Extraction
by: Chopra, Anuradha, et al.
Published: (2024)
Smart Timing for Mining: A Deep Learning Framework for Bitcoin Hardware ROI Prediction
by: Wickramasinghe, Sithumi, et al.
Published: (2025)
MERIT: Learning Disentangled Music Representations for Audio Similarity
by: Roy, Abhinaba, et al.
Published: (2026)
SonicVerse: Multi-Task Learning for Music Feature-Informed Captioning
by: Chopra, Anuradha, et al.
Published: (2025)
Text2midi-InferAlign: Improving Symbolic Music Generation with Inference-Time Alignment
by: Roy, Abhinaba, et al.
Published: (2025)
To Embody or Not: The Effect Of Embodiment On User Perception Of LLM-based Conversational Agents
by: Wang, Kyra, et al.
Published: (2025)
Can-Do! A Dataset and Neuro-Symbolic Grounded Framework for Embodied Planning with Large Multimodal Models
by: Chia, Yew Ken, et al.
Published: (2024)
Scaffolded Vulnerability: Chatbot-Mediated Reciprocal Self-Disclosure and Need-Supportive Interaction in Couples
by: Jiang, Zhuoqun, et al.
Published: (2026)
Exact Flow Linear Attention: Exact Solution from Continuous-Time Dynamics
by: Lei, Jingdi, et al.
Published: (2025)
Towards Robust Instruction Tuning on Multimodal Large Language Models
by: Han, Wei, et al.
Published: (2024)