:: Library Catalog

Bibliographic Details
Main Authors:	Liu, Jiaqi, Qiu, Shi, Li, Mairui, Li, Bingzhou, Ji, Haonian, Han, Siwei, Ye, Xinyu, Xia, Peng, Dong, Zihan, Chen, Meng, Zhang, Congyu, Zhang, Letian, Chen, Guiming, Tu, Haoqin, Yang, Xinyu, Feng, Lu, Zhao, Xujiang, Chen, Haifeng, Zhou, Jiawei, Wang, Xiao, Zhang, Weitong, Zhu, Hongtu, Li, Yun, Mei, Jieru, Fei, Hongliang, Zhang, Jiaheng, Li, Linjie, Zhang, Linjun, Zhou, Yuyin, Wang, Sheng, Xiong, Caiming, Zou, James, Zheng, Zeyu, Xie, Cihang, Ding, Mingyu, Yao, Huaxiu
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.20025

Similar Items

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild
by: Xia, Peng, et al.
Published: (2026)

ClawArena: Benchmarking AI Agents in Evolving Information Environments
by: Ji, Haonian, et al.
Published: (2026)

Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw
by: Wang, Zijun, et al.
Published: (2026)

Kestrel: Grounding Self-Refinement for LVLM Hallucination Mitigation
by: Mao, Jiawei, et al.
Published: (2026)

ClinSeekAgent: Automating Multimodal Evidence Seeking for Agentic Clinical Reasoning
by: Wu, Juncheng, et al.
Published: (2026)

3D-TransUNet for Brain Metastases Segmentation in the BraTS2023 Challenge
by: Yang, Siwei, et al.
Published: (2024)

OpenVision 2: A Family of Generative Pretrained Visual Encoders for Multimodal Learning
by: Liu, Yanqing, et al.
Published: (2025)

RULE: Reliable Multimodal RAG for Factuality in Medical Vision Language Models
by: Xia, Peng, et al.
Published: (2024)

ClawForge: Generating Executable Interactive Benchmarks for Command-Line Agents
by: Lai, Yuxiang, et al.
Published: (2026)

A Unified and Controllable Framework for Layered Image Generation with Visual Effects
by: Yang, Jinrui, et al.
Published: (2026)

GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset
by: Wang, Yuhan, et al.
Published: (2025)

OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation
by: Zhang, Letian, et al.
Published: (2026)

What If We Recaption Billions of Web Images with LLaMA-3?
by: Li, Xianhang, et al.
Published: (2024)

SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
by: Chen, Hardy, et al.
Published: (2025)

Autoregressive Pretraining with Mamba in Vision
by: Ren, Sucheng, et al.
Published: (2024)

VLAA-GUI: Knowing When to Stop, Recover, and Search, A Modular Framework for GUI Automation
by: Han, Qijun, et al.
Published: (2026)

A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties
by: Xiao, Junfei, et al.
Published: (2023)

EvolveMem: Self-Evolving Memory Architecture via AutoResearch for LLM Agents
by: Liu, Jiaqi, et al.
Published: (2026)

S$^{2}$FT: Efficient, Scalable and Generalizable LLM Fine-tuning by Structured Sparsity
by: Yang, Xinyu, et al.
Published: (2024)

Where on Earth? A Vision-Language Benchmark for Probing Model Geolocation Skills Across Scales
by: Qian, Zhaofang, et al.
Published: (2025)

Uniqueness of Positive Solutions for Fractional Schrödinger Equations with General Nonlinearities
by: Li, Xinyu, et al.
Published: (2024)

AttnGCG: Enhancing Jailbreaking Attacks on LLMs with Attention Manipulation
by: Wang, Zijun, et al.
Published: (2024)

MedVerse: Efficient and Reliable Medical Reasoning via DAG-Structured Parallel Execution
by: Chen, Jianwen, et al.
Published: (2026)

Near-Optimal Second-Order Guarantees for Model-Based Adversarial Imitation Learning
by: Li, Shangzhe, et al.
Published: (2025)

From Seeing to Thinking: Decoupling Perception and Reasoning Improves Post-Training of Vision-Language Models
by: Wu, Juncheng, et al.
Published: (2026)

Variational Matrix-Learning Fourier Networks for Parametric Multiphysics Surrogates
by: Li, Xinyu, et al.
Published: (2026)

Mind the Gap in Cultural Alignment: Task-Aware Culture Management for Large Language Models
by: Zhang, Binchi, et al.
Published: (2026)

Knowledge or Reasoning? A Close Look at How LLMs Think Across Domains
by: Wu, Juncheng, et al.
Published: (2025)

Alignment Tipping Process: How Self-Evolution Pushes LLM Agents Off the Rails
by: Han, Siwei, et al.
Published: (2025)

Kinetics of nonisothermal crystallization of SiO2–TiO2–CaO–BaO–Al2O3‐based fluorine‐free mold flux for high‐Ti steel
by: Jiajing Zhang, et al.
Published: (2025)

From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Reasoning-Driven Pedagogical Visualization
by: Ji, Haonian, et al.
Published: (2025)

Provably Efficient Offline-to-Online Value Adaptation with General Function Approximation
by: Li, Shangzhe, et al.
Published: (2026)

MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding
by: Han, Siwei, et al.
Published: (2025)

Personalizing black-box models for nonparametric regression with minimax optimality
by: Li, Sai, et al.
Published: (2026)

FAIRM: Learning invariant representations for algorithmic fairness and domain generalization with minimax optimality
by: Li, Sai, et al.
Published: (2024)

Research on reinforcement learning based warehouse robot navigation algorithm in complex warehouse layout
by: Li, Keqin, et al.
Published: (2024)

Time – The fourth dimension of immune cells
by: Guiming Li, et al.
Published: (2024)

CellTypeAgent: Trustworthy cell type annotation with Large Language Models
by: Chen, Jiawen, et al.
Published: (2025)

The Devil is in the Few Shots: Iterative Visual Knowledge Completion for Few-shot Learning
by: Li, Yaohui, et al.
Published: (2024)

Pigeonhole Stochastic Gradient Langevin Dynamics for Large Crossed Mixed Effects Models
by: Zhang, Xinyu, et al.
Published: (2022)