| Main Authors: | Lu, Hongliang, Wen, Yuhang, Cheng, Pengyu, Ding, Ruijin, Guo, Jiaqi, Xu, Haotian, Wang, Chutian, Chen, Haonan, Jiang, Xiaoxi, Jiang, Guanjun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.18821 |
Similar Items
MARCH: Multi-Agent Reinforced Self-Check for LLM Hallucination
by: Li, Zhuo, et al.
Published: (2026)
Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills
by: Ni, Jingwei, et al.
Published: (2026)
CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR
by: Cui, Sijia, et al.
Published: (2026)
Pushing the Frontiers of Self-Distillation Prototypes Network with Dimension Regularization and Score Normalization
by: Chen, Yafeng, et al.
Published: (2025)
Bringing Stability to Diffusion: Decomposing and Reducing Variance of Training Masked Diffusion Models
by: Jia, Mengni, et al.
Published: (2025)
Eliminating Inductive Bias in Reward Models with Information-Theoretic Guidance
by: Li, Zhuo, et al.
Published: (2025)
ZeroSearch: Incentivize the Search Capability of LLMs without Searching
by: Sun, Hao, et al.
Published: (2025)
AgentFrontier: Expanding the Capability Frontier of LLM Agents with ZPD-Guided Data Synthesis
by: Chen, Xuanzhong, et al.
Published: (2025)
S-Agents: Self-organizing Agents in Open-ended Environments
by: Chen, Jiaqi, et al.
Published: (2024)
Pushing Forward Pareto Frontiers of Proactive Agents with Behavioral Agentic Optimization
by: Yao, Yihang, et al.
Published: (2026)
Pushing Frontiers for Proteoglycans
by: Marissa L. Maciej‐Hulme
Published: (2026)
How Can Haptic Feedback Assist People with Blind and Low Vision (BLV): A Systematic Literature Review
by: Jiang, Chutian, et al.
Published: (2024)
IndustryEQA: Pushing the Frontiers of Embodied Question Answering in Industrial Scenarios
by: Li, Yifan, et al.
Published: (2025)
Iterative Data-Consistent Inversion with Multiple Push-forward Constraints
by: Jiang, Tianyi, et al.
Published: (2026)
Self-playing Adversarial Language Game Enhances LLM Reasoning
by: Cheng, Pengyu, et al.
Published: (2024)
Writing-Zero: Bridge the Gap Between Non-verifiable Tasks and Verifiable Rewards
by: Jia, Ruipeng, et al.
Published: (2025)
Alignment Tipping Process: How Self-Evolution Pushes LLM Agents Off the Rails
by: Han, Siwei, et al.
Published: (2025)
Pushing Radar Odometry Beyond the Pavement: Current Capabilities and Challenges
by: Kolhe, Shaunak, et al.
Published: (2026)
QuarkMedSearch: A Long-Horizon Deep Search Agent for Exploring Medical Intelligence
by: Lin, Zhichao, et al.
Published: (2026)
Almost sharp global wellposedness and scattering for the defocusing conformal wave equation on the hyperbolic space
by: Ma, Chutian
Published: (2023)
DOCTOR: Dynamic On-Chip Temporal Variation Remediation Toward Self-Corrected Photonic Tensor Accelerators
by: Lu, Haotian, et al.
Published: (2024)
Ola: Pushing the Frontiers of Omni-Modal Language Model
by: Liu, Zuyan, et al.
Published: (2025)
Answer First, Reason Later: Aligning Search Relevance via Mode-Balanced Reinforcement Learning
by: Zhang, Shijie, et al.
Published: (2026)
Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models
by: Dong, Guanting, et al.
Published: (2024)
Forecasting Frontier Language Model Agent Capabilities
by: Pimpale, Govind, et al.
Published: (2025)
Multi-Modality Spatio-Temporal Forecasting via Self-Supervised Learning
by: Deng, Jiewen, et al.
Published: (2024)
Pushing the Frontier on Approximate EFX Allocations
by: Amanatidis, Georgios, et al.
Published: (2024)
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
by: Comanici, Gheorghe, et al.
Published: (2025)
Rationale Matters: Learning Transferable Rubrics via Proxy-Guided Critique for VLM Reward Models
by: Qiu, Weijie, et al.
Published: (2026)
Open Rubric System: Scaling Reinforcement Learning with Pairwise Adaptive Rubric
by: Jia, Ruipeng, et al.
Published: (2026)
FedAR: Addressing Client Unavailability in Federated Learning with Local Update Approximation and Rectification
by: Jiang, Chutian, et al.
Published: (2024)
Physical Neural Networks with Self-Learning Capabilities
by: Yu, Weichao, et al.
Published: (2024)
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL
by: Wang, Junke, et al.
Published: (2025)
Modularity in Argyres-Douglas Theories with $a=c$
by: Jiang, Hongliang
Published: (2024)
Time-reversal invariant TQFTs from self-mirror symmetric SCFTs
by: Jiang, Hongliang
Published: (2024)
D1-D5 CFT data from $AdS_3 \times S^3$ Virasoro-Shapiro amplitude
by: Jiang, Hongliang
Published: (2026)
Macdonald Index from VOA and Graded Unitarity
by: Jiang, Hongliang
Published: (2026)
Probing the Mid-level Vision Capabilities of Self-Supervised Learning
by: Chen, Xuweiyi, et al.
Published: (2024)
M3Ret: Unleashing Zero-shot Multimodal Medical Image Retrieval via Self-Supervision
by: Liu, Che, et al.
Published: (2025)
Mem$^2$Evolve: Towards Self-Evolving Agents via Co-Evolutionary Capability Expansion and Experience Distillation
by: Cheng, Zihao, et al.
Published: (2026)