| Main Authors: | Zhi, Weihai; Guo, Jiayan; Li, Shangyang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2508.20549 |
Similar Items
Schema-Adaptive Tabular Representation Learning with LLMs for Generalizable Multimodal Clinical Reasoning
by: Mao, Hongxi, et al.
Published: (2026)
AutoML-Med: A Framework for Automated Machine Learning in Medical Tabular Data
by: Francia, Riccardo, et al.
Published: (2025)
MedInsightBench: Evaluating Medical Analytics Agents Through Multi-Step Insight Discovery in Multimodal Medical Data
by: Zhu, Zhenghao, et al.
Published: (2025)
MedGNN: Towards Multi-resolution Spatiotemporal Graph Learning for Medical Time Series Classification
by: Fan, Wei, et al.
Published: (2025)
MedRAX: Medical Reasoning Agent for Chest X-ray
by: Fallahpour, Adibvafa, et al.
Published: (2025)
Breaking the Factorization Barrier in Diffusion Language Models
by: Li, Ian, et al.
Published: (2026)
GR-Agent: Adaptive Graph Reasoning Agent under Incomplete Knowledge
by: Zhou, Dongzhuoran, et al.
Published: (2025)
Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning
by: Zhou, Yang, et al.
Published: (2025)
MedRECT: A Medical Reasoning Benchmark for Error Correction in Clinical Texts
by: Iwase, Naoto, et al.
Published: (2025)
DRPO: Efficient Reasoning via Decoupled Reward Policy Optimization
by: Li, Gang, et al.
Published: (2025)
SemiReward: A General Reward Model for Semi-supervised Learning
by: Li, Siyuan, et al.
Published: (2023)
From Federated Learning to X-Learning: Breaking the Barriers of Decentrality Through Random Walks
by: Salihovic, Allan, et al.
Published: (2025)
Entropy-Guided Data-Efficient Training for Multimodal Reasoning Reward Models
by: Yang, Shidong, et al.
Published: (2026)
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time
by: Wang, Haozhe, et al.
Published: (2026)
Breaking the Modality Barrier: Generative Modeling for Accurate Molecule Retrieval from Mass Spectra
by: Zhang, Yiwen, et al.
Published: (2025)
DGRO: Enhancing LLM Reasoning via Exploration-Exploitation Control and Reward Variance Management
by: Su, Xuerui, et al.
Published: (2025)
Rewarding Graph Reasoning Process makes LLMs more Generalized Reasoners
by: Peng, Miao, et al.
Published: (2025)
Breaking the Reclustering Barrier in Centroid-based Deep Clustering
by: Miklautz, Lukas, et al.
Published: (2024)
SIGMA: Refining Large Language Model Reasoning via Sibling-Guided Monte Carlo Augmentation
by: Ren, Yanwei, et al.
Published: (2025)
APLOT: Robust Reward Modeling via Adaptive Preference Learning with Optimal Transport
by: Li, Zhuo, et al.
Published: (2025)
MedCLM: Learning to Localize and Reason via a CoT-Curriculum in Medical Vision-Language Models
by: Kim, Soo Yong, et al.
Published: (2025)
Incentivizing Consistent, Effective and Scalable Reasoning Capability in Audio LLMs via Reasoning Process Rewards
by: Fan, Jiajun, et al.
Published: (2025)
CausalMed: Causality-Based Personalized Medication Recommendation Centered on Patient health state
by: Li, Xiang, et al.
Published: (2024)
Breaking the Safety-Capability Tradeoff: Reinforcement Learning with Verifiable Rewards Maintains Safety Guardrails in LLMs
by: Cho, Dongkyu Derek, et al.
Published: (2025)
Verifying Meta-Awareness via Predictive Rewards in Reasoning Models
by: Kim, Yoonjeon, et al.
Published: (2025)
Boosting LLM Reasoning via Human-Inspired Reward Shaping
by: Lin, Wenze, et al.
Published: (2026)
RLVMR: Reinforcement Learning with Verifiable Meta-Reasoning Rewards for Robust Long-Horizon Agents
by: Zhang, Zijing, et al.
Published: (2025)
Breaking the Barrier: Enhanced Utility and Robustness in Smoothed DRL Agents
by: Sun, Chung-En, et al.
Published: (2024)
MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding
by: Zuo, Yuxin, et al.
Published: (2025)
MedMamba: Recasting Mamba for Medical Time Series Classification
by: He, ZhengXiao, et al.
Published: (2026)
Auxiliary Reward Generation with Transition Distance Representation Learning
by: Li, Siyuan, et al.
Published: (2024)
BarrierSteer: LLM Safety via Learning Barrier Steering
by: Tran, Thanh Q., et al.
Published: (2026)
Self-Rewarding Rubric-Based Reinforcement Learning for Open-Ended Reasoning
by: Ye, Zhiling, et al.
Published: (2025)
Accelerating LLM Reasoning via Early Rejection with Partial Reward Modeling
by: Cheshmi, Seyyed Saeid, et al.
Published: (2025)
Beyond Scalar Reward Model: Learning Generative Judge from Preference Data
by: Ye, Ziyi, et al.
Published: (2024)
Generalizing Behavior via Inverse Reinforcement Learning with Closed-Form Reward Centroids
by: Lazzati, Filippo, et al.
Published: (2025)
Reward Generation via Large Vision-Language Model in Offline Reinforcement Learning
by: Lee, Younghwan, et al.
Published: (2025)
CafeMed: Causal Attention Fusion Enhanced Medication Recommendation
by: Ren, Kelin, et al.
Published: (2025)
MedRep: Medical Concept Representation for General Electronic Health Record Foundation Models
by: Kim, Junmo, et al.
Published: (2025)
BWLA: Breaking the Barrier of W1AX Post-Training Quantization for LLMs
by: Zhao, Zhixiong, et al.
Published: (2026)