Library Catalog

Bibliographic Details
Main Authors:	Wang, Qinsi; Liu, Bo; Zhou, Tianyi; Shi, Jing; Lin, Yueqian; Chen, Yiran; Li, Hai Helen; Wan, Kun; Zhao, Wentian
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2509.25541

Similar Items

AsyncVoice Agent: Real-Time Explanation for LLM Planning and Reasoning
by: Lin, Yueqian, et al.
Published: (2025)

Keyframe-oriented Vision Token Pruning: Enhancing Efficiency of Large Vision Language Models on Long-Form Video Processing
by: Liu, Yudong, et al.
Published: (2025)

HippoMM: Hippocampal-inspired Multimodal Memory for Long Audiovisual Event Understanding
by: Lin, Yueqian, et al.
Published: (2025)

FlashFPS: Efficient Farthest Point Sampling for Large-Scale Point Clouds via Pruning and Caching
by: Fu, Yuzhe, et al.
Published: (2026)

Bridging the Perception Gap: A Lightweight Coarse-to-Fine Architecture for Edge Audio Systems
by: Zhang, Hengfan, et al.
Published: (2026)

Voice Evaluation of Reasoning Ability: Diagnosing the Modality-Induced Performance Gap
by: Lin, Yueqian, et al.
Published: (2025)

Efficient Self-Improvement in Multimodal Large Language Models: A Model-Level Judge-Free Approach
by: Deng, Shijian, et al.
Published: (2024)

LLaViDA: A Large Language Vision Driving Assistant for Explicit Reasoning and Enhanced Trajectory Planning
by: Liu, Yudong, et al.
Published: (2025)

MUSE: A Run-Centric Platform for Multimodal Unified Safety Evaluation of Large Language Models
by: Wang, Zhongxi, et al.
Published: (2026)

CoreMatching: A Co-adaptive Sparse Inference Framework with Token and Neuron Pruning for Comprehensive Acceleration of Vision-Language Models
by: Wang, Qinsi, et al.
Published: (2025)

SD-NAE: Generating Natural Adversarial Examples with Stable Diffusion
by: Lin, Yueqian, et al.
Published: (2023)

Latent Bridge: Feature Delta Prediction for Efficient Dual-System Vision-Language-Action Model Inference
by: Liu, Yudong, et al.
Published: (2026)

Angles Don't Lie: Unlocking Training-Efficient RL Through the Model's Own Signals
by: Wang, Qinsi, et al.
Published: (2025)

MARSHAL: Incentivizing Multi-Agent Reasoning via Self-Play with Strategic LLMs
by: Yuan, Huining, et al.
Published: (2025)

Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement
by: Wang, Xiyao, et al.
Published: (2024)

Query-Conditioned Evidential Keyframe Sampling for MLLM-Based Long-Form Video Understanding
by: Wang, Yiheng, et al.
Published: (2026)

Self-Improvement as Coherence Optimization: A Theoretical Account
by: Qiu, Tianyi, et al.
Published: (2026)

Focus: A Streaming Concentration Architecture for Efficient Vision-Language Models
by: Wei, Chiyue, et al.
Published: (2025)

AMARIS: A Memory-Augmented Rubric Improvement System for Rubric-Based Reinforcement Learning
by: Wu, Peilin, et al.
Published: (2026)

When Does Multi-Agent RL Improve LLM Workflows? Workflow, Scale, and Policy-Sharing Tradeoffs
by: Zeng, Yifan, et al.
Published: (2026)

DyNaVLM: Zero-Shot Vision-Language Navigation System with Dynamic Viewpoints and Self-Refining Graph Memory
by: Ji, Zihe, et al.
Published: (2025)

Self-Supervised Weight Templates for Scalable Vision Model Initialization
by: Xie, Yucheng, et al.
Published: (2026)

SpeechPrune: Context-aware Token Pruning for Speech Information Retrieval
by: Lin, Yueqian, et al.
Published: (2024)

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning
by: Liu, Bo, et al.
Published: (2025)

π-Play: Multi-Agent Self-Play via Privileged Self-Distillation without External Data
by: Zhang, Yaocheng, et al.
Published: (2026)

SPP1 May Play an Important Role in the Carcinoid Nature of PAH
by: Huang, Yuxia, et al.
Published: (2025)

KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems
by: Ye, Hancheng, et al.
Published: (2025)

Absolute Zero: Reinforced Self-play Reasoning with Zero Data
by: Zhao, Andrew, et al.
Published: (2025)

Consensus Entropy: Harnessing Multi-VLM Agreement for Self-Verifying and Self-Improving OCR
by: Zhang, Yulong, et al.
Published: (2025)

Strategic Self-Improvement for Competitive Agents in AI Labour Markets
by: Chiu, Christopher, et al.
Published: (2025)

GenAI at the Edge: Comprehensive Survey on Empowering Edge Devices
by: Navardi, Mozhgan, et al.
Published: (2025)

Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement
by: Yin, Xunjian, et al.
Published: (2024)

Team-Based Self-Play With Dual Adaptive Weighting for Fine-Tuning LLMs
by: Li, Wu, et al.
Published: (2026)

G-Zero: Self-Play for Open-Ended Generation from Zero Data
by: Huang, Chengsong, et al.
Published: (2026)

T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning
by: Wang, Qinsi, et al.
Published: (2026)

StreetScape: Gamified Tactile Interactions for Collaborative Learning and Play
by: Khalaila, Areen, et al.
Published: (2025)

Enhancing Language Agent Strategic Reasoning through Self-Play in Adversarial Games
by: Zhang, Yikai, et al.
Published: (2025)

Iterative Self-Improvement of Vision Language Models for Image Scoring and Self-Explanation
by: Tanji, Naoto, et al.
Published: (2025)

DUMP: Automated Distribution-Level Curriculum Learning for RL-based LLM Post-training
by: Wang, Zhenting, et al.
Published: (2025)

Learn to Think: Improving Multimodal Reasoning through Vision-Aware Self-Improvement Training
by: Zhong, Qihuang, et al.
Published: (2026)