| Main authors: | Zhu, Erle; Liu, Yadi; Zhang, Zhe; Li, Xujun; Zhou, Jin; Yu, Xinjie; Huang, Minlie; Wang, Hongning |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online access: | https://arxiv.org/abs/2501.10768 |
Similar items
SkillEvolver: Skill Learning as a Meta-Skill
by: Zhang, Genrui, et al.
Published: (2026)
MAPS: Multi-Agent Personality Shaping for Collaborative Reasoning
by: Zhang, Jian, et al.
Published: (2025)
Grounding LLMs in Scientific Discovery via Embodied Actions
by: Zhang, Bo, et al.
Published: (2026)
Guiding not Forcing: Enhancing the Transferability of Jailbreaking Attacks on LLMs via Removing Superfluous Constraints
by: Yang, Junxiao, et al.
Published: (2025)
Trust-Region Adaptive Policy Optimization
by: Su, Mingyu, et al.
Published: (2025)
LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models
by: Gui, Jiayi, et al.
Published: (2024)
PhysUniBench: A Multi-Modal Physics Reasoning Benchmark at Undergraduate Level
by: Wang, Lintao, et al.
Published: (2025)
BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs
by: Yang, Junxiao, et al.
Published: (2025)
Geometric Mixture-of-Experts with Curvature-Guided Adaptive Routing for Graph Representation Learning
by: Cao, Haifang, et al.
Published: (2026)
MAPS: Multi-Fidelity AI-Augmented Photonic Simulation and Inverse Design Infrastructure
by: Ma, Pingchuan, et al.
Published: (2025)
Geo-Expert: Towards Expert-Level Geological Reasoning via Parameter-Efficient Fine-Tuning
by: Guo, Chenyou, et al.
Published: (2026)
Hierarchical Attacks for Multi-Modal Multi-Agent Reasoning
by: Zhou, Hao, et al.
Published: (2026)
AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models
by: Cheng, Jiale, et al.
Published: (2024)
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
by: Cheng, Jiale, et al.
Published: (2024)
LongSafety: Evaluating Long-Context Safety of Large Language Models
by: Lu, Yida, et al.
Published: (2025)
Large Images are Gaussians: High-Quality Large Image Representation with Levels of 2D Gaussian Splatting
by: Zhu, Lingting, et al.
Published: (2025)
Stop Before You Fail: Operational Capability Boundaries for Mitigating Unproductive Reasoning in Large Reasoning Models
by: Zhang, Qingjie, et al.
Published: (2025)
HumanAesExpert: Advancing a Multi-Modality Foundation Model for Human Image Aesthetic Assessment
by: Liao, Zhichao, et al.
Published: (2025)
SeePhys Pro: Diagnosing Modality Transfer and Blind-Training Effects in Multimodal RLVR for Physics Reasoning
by: Xiang, Kun, et al.
Published: (2026)
Benchmarking Retrieval-Augmented Generation in Multi-Modal Contexts
by: Liu, Zhenghao, et al.
Published: (2025)
UrbanMoE: A Sparse Multi-Modal Mixture-of-Experts Framework for Multi-Task Urban Region Profiling
by: Liu, Pingping, et al.
Published: (2026)
Data-Efficient RLVR via Off-Policy Influence Guidance
by: Zhu, Erle, et al.
Published: (2025)
FinSearchComp: Towards a Realistic, Expert-Level Evaluation of Financial Search and Reasoning
by: Hu, Liang, et al.
Published: (2025)
MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding
by: Zuo, Yuxin, et al.
Published: (2025)
Knowledge Graph-Driven Expert-Level Reasoning for Neuroscience
by: Stephen, Jake, et al.
Published: (2026)
Benchmarking Complex Instruction-Following with Multiple Constraints Composition
by: Wen, Bosi, et al.
Published: (2024)
Agentic Active Omni-Modal Perception for Multi-Hop Audio-Visual Reasoning
by: Xu, Ke, et al.
Published: (2026)
DPRM: A Dual Implicit Process Reward Model in Multi-Hop Question Answering
by: Wang, Xinyi, et al.
Published: (2025)
DSBench: How Far Are Data Science Agents from Becoming Data Science Experts?
by: Jing, Liqiang, et al.
Published: (2024)
IntentionESC: An Intention-Centered Framework for Enhancing Emotional Support in Dialogue Systems
by: Zhang, Xinjie, et al.
Published: (2025)
CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation
by: Ke, Pei, et al.
Published: (2023)
Multi-Modal Time Series Prediction via Mixture of Modulated Experts
by: Zhang, Lige, et al.
Published: (2026)
Hierarchical Time-Aware Mixture of Experts for Multi-Modal Sequential Recommendation
by: Zhang, Shengzhe, et al.
Published: (2025)
LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety
by: Yang, Junxiao, et al.
Published: (2026)
When Does Persona Prompting Actually Help? A Retrieval and Metric Analysis of Expert Role Injection in LLMs
by: Xiao, Shuai, et al.
Published: (2026)
ProReason: Multi-Modal Proactive Reasoning with Decoupled Eyesight and Wisdom
by: Zhou, Jingqi, et al.
Published: (2024)
Asymmetric Generative Recommendation via Multi-Expert Projection and Multi-Faceted Hierarchical Quantization
by: Huang, Bin, et al.
Published: (2026)
MSQA: Benchmarking LLMs on Graduate-Level Materials Science Reasoning and Knowledge
by: Cheung, Jerry Junyang, et al.
Published: (2025)
How Far Are VLMs from Privacy Awareness in the Physical World? An Empirical Study
by: Wang, Junran, et al.
Published: (2026)
RE-MCDF: Closed-Loop Multi-Expert LLM Reasoning for Knowledge-Grounded Clinical Diagnosis
by: Shen, Shaowei, et al.
Published: (2026)