:: Library Catalog

Bibliographic Details
Main Authors:	Peng, Kun; Tan, Conghui; Liu, Yu; Tang, Guohua; Sun, Zhongqian; Yang, Wei; Zhu, Zining; Jiang, Lei; Liu, Yanbing; Peng, Hao
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.08533

Similar Items

Think in Games: Learning to Reason in Games via Reinforcement Learning with Large Language Models
by: Liao, Yi, et al.
Published: (2025)

Dialogues Aspect-based Sentiment Quadruple Extraction via Structural Entropy Minimization Partitioning
by: Peng, Kun, et al.
Published: (2025)

What-If Analysis of Large Language Models: Explore the Game World Using Proactive Thinking
by: Sui, Yuan, et al.
Published: (2025)

Emotion Transfer with Enhanced Prototype for Unseen Emotion Recognition in Conversation
by: Peng, Kun, et al.
Published: (2025)

T-T: Table Transformer for Tagging-based Aspect Sentiment Triplet Extraction
by: Peng, Kun, et al.
Published: (2025)

Superior energy storage performance in NaNbO₃‐based lead‐free ceramics under low electric field
by: Liu, Kun, et al.
Published: (2024)

RC-GRPO: Reward-Conditioned Group Relative Policy Optimization for Multi-Turn Tool Calling Agents
by: Zhong, Haitian, et al.
Published: (2026)

BEATS: Optimizing LLM Mathematical Capabilities with BackVerify and Adaptive Disambiguate based Efficient Tree Search
by: Sun, Linzhuang, et al.
Published: (2024)

DanceGRPO: Unleashing GRPO on Visual Generation
by: Xue, Zeyue, et al.
Published: (2025)

Analysis of minimum orbital periods around d-dimensional charged black holes
by: Peng, Yan, et al.
Published: (2025)

Upper bound on the radius of the innermost photonsphere in the regular compact star spacetime
by: Liu, Guohua, et al.
Published: (2024)

Bounds on the minimum orbital periods of non-singular Hayward and Bardeen black holes
by: Liu, Guohua, et al.
Published: (2025)

MuVaC: A Variational Causal Framework for Multimodal Sarcasm Understanding in Dialogues
by: Guo, Diandian, et al.
Published: (2026)

Large Language Models as Agents in Two-Player Games
by: Liu, Yang, et al.
Published: (2024)

EMIT: Enhancing MLLMs for Industrial Anomaly Detection via Difficulty-Aware GRPO
by: Guan, Wei, et al.
Published: (2025)

Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO
by: Hong, Haoyang, et al.
Published: (2025)

GROW: Aligning GRPO with State-Action Modeling for Open-World VLM Agents
by: Wu, Xiongbin, et al.
Published: (2026)

SofT-GRPO: Surpassing Discrete-Token LLM Reinforcement Learning via Gumbel-Reparameterized Soft-Thinking Policy Optimization
by: Zheng, Zhi, et al.
Published: (2025)

Adaptive Content Restriction for Large Language Models via Suffix Optimization
by: Li, Yige, et al.
Published: (2025)

TIGFlow-GRPO: Trajectory Forecasting via Interaction-Aware Flow Matching and Reward-Guided Optimization
by: Jing, Xuepeng, et al.
Published: (2026)

IRPO: Boosting Image Restoration via Post-training GRPO
by: Xu, Haoxuan, et al.
Published: (2025)

LinguaGame: A Linguistically Grounded Game-Theoretic Paradigm for Multi-Agent Dialogue Generation
by: Ye, Yuxiao, et al.
Published: (2026)

Trans-RAG: Query-Centric Vector Transformation for Secure Cross-Organizational Retrieval
by: Liu, Yu, et al.
Published: (2026)

Table-Filling via Mean Teacher for Cross-domain Aspect Sentiment Triplet Extraction
by: Peng, Kun, et al.
Published: (2024)

PRISMA: Reinforcement Learning Guided Two-Stage Policy Optimization in Multi-Agent Architecture for Open-Domain Multi-Hop Question Answering
by: Liu, Yu, et al.
Published: (2026)

PerPO: Perceptual Preference Optimization via Discriminative Rewarding
by: Zhu, Zining, et al.
Published: (2025)

AceGRPO: Adaptive Curriculum Enhanced Group Relative Policy Optimization for Autonomous Machine Learning Engineering
by: Cai, Yuzhu, et al.
Published: (2026)

A Complete Mental Temporal Logic for Intelligent Agent
by: Cao, Zining
Published: (2025)

Scaf-GRPO: Scaffolded Group Relative Policy Optimization for Enhancing LLM Reasoning
by: Zhang, Xichen, et al.
Published: (2025)

Design and Optimization of Reinforcement Learning-Based Agents in Text-Based Games
by: Wang, Haonan, et al.
Published: (2025)

Graph-GRPO: Stabilizing Multi-Agent Topology Learning via Group Relative Policy Optimization
by: Cang, Yueyang, et al.
Published: (2026)

Semantic Reformulation Entropy for Robust Hallucination Detection in QA Tasks
by: Tong, Chaodong, et al.
Published: (2025)

OPERA: A Reinforcement Learning–Enhanced Orchestrated Planner-Executor Architecture for Reasoning-Oriented Multi-Hop Retrieval
by: Liu, Yu, et al.
Published: (2025)

GRPO-GCC: Enhancing Cooperation in Spatial Public Goods Games via Group Relative Policy Optimization with Global Cooperation Constraint
by: Yang, Zhaoqilin, et al.
Published: (2025)

Delay-Aware Multi-Agent Reinforcement Learning for Cooperative Adaptive Cruise Control with Model-based Stability Enhancement
by: Liu, Jiaqi, et al.
Published: (2024)

EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization
by: Fu, Zhongqian, et al.
Published: (2025)

DiverseGRPO: Mitigating Mode Collapse in Image Generation via Diversity-Aware GRPO
by: Liu, Henglin, et al.
Published: (2025)

SI‐FloatDet: A Visual Inspection Method for Water Surface Cleaning Robots Based on Shallow Information Injection and Adaptive Spatial Refinement
by: Yu, Guohua, et al.
Published: (2025)

LithoGRPO: Fast Inverse Lithography via GRPO Reinforced Flow Matching
by: Lai, Yao, et al.
Published: (2026)

Knowledge Dependency Estimation for Reliable Question Answering
by: Tong, Chaodong, et al.
Published: (2026)