:: Library Catalog

Bibliographic Details
Main Authors:	Zhou, Yuhang; Zhang, Mingrui; Li, Ke; Wang, Mingyi; Liu, Qiao; Wang, Qifei; Liu, Jiayi; Liu, Fei; Li, Serena; Li, Weiwei; Gao, Mingze; Kumar, Abhishek; Fan, Xiangjun; Zhao, Zhuokai; Zhang, Lizhu
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2510.20176

Similar Items

LLM-Driven Reasoning for Constraint-Aware Feature Selection in Industrial Systems
by: Zhou, Yuhang, et al.
Published: (2026)

OmniOPD: Logit-Free On-Policy Distillation via Speculative Verification
by: Zhou, Yuhang, et al.
Published: (2026)

Synthetic Sandbox for Training Machine Learning Engineering Agents
by: Zhou, Yuhang, et al.
Published: (2026)

EBPO: Empirical Bayes Shrinkage for Stabilizing Group-Relative Policy Optimization
by: Han, Kevin, et al.
Published: (2026)

S'MoRE: Structural Mixture of Residual Experts for Parameter-Efficient LLM Fine-tuning
by: Zeng, Hanqing, et al.
Published: (2025)

GEM: Empowering LLM for both Embedding Generation and Language Understanding
by: Zhang, Caojin, et al.
Published: (2025)

RecoWorld: Building Simulated Environments for Agentic Recommender Systems
by: Liu, Fei, et al.
Published: (2025)

CAFe: Unifying Representation and Generation with Contrastive-Autoregressive Finetuning
by: Yu, Hao, et al.
Published: (2025)

Thought Communication in Multiagent Collaboration
by: Zheng, Yujia, et al.
Published: (2025)

Exploring System 1 and 2 communication for latent reasoning in LLMs
by: Coda-Forno, Julian, et al.
Published: (2025)

Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment
by: Wang, Chaoqi, et al.
Published: (2025)

TARo: Token-level Adaptive Routing for LLM Test-time Alignment
by: Rai, Arushi, et al.
Published: (2026)

Agentic Recommender System with Hierarchical Belief-State Memory
by: Shen, Xiang, et al.
Published: (2026)

Let it Calm: Exploratory Annealed Decoding for Verifiable Reinforcement Learning
by: Yang, Chenghao, et al.
Published: (2025)

Token-Level LLM Collaboration via FusionRoute
by: Xiong, Nuoya, et al.
Published: (2026)

StreamMem: Query-Agnostic KV Cache Memory for Streaming Video Understanding
by: Yang, Yanlai, et al.
Published: (2025)

Facet-Aware Multi-Head Mixture-of-Experts Model for Sequential Recommendation
by: Liu, Mingrui, et al.
Published: (2024)

GISTBench: Evaluating LLM User Understanding via Evidence-Based Interest Verification
by: Fostiropoulos, Iordanis, et al.
Published: (2026)

Grid Evolution for Doubly Fractional Channel Estimation in OTFS Systems
by: Li, Xiangjun, et al.
Published: (2024)

Facet-Aware Multi-Head Mixture-of-Experts Model with Text-Enhanced Pre-training for Sequential Recommendation
by: Liu, Mingrui, et al.
Published: (2026)

A Joint Prediction Method of Multi-Agent to Reduce Collision Rate
by: Wang, Mingyi, et al.
Published: (2024)

DAG-MoE: From Simple Mixture to Structural Aggregation in Mixture-of-Experts
by: Feng, Jiarui, et al.
Published: (2026)

DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data
by: Zhou, Yuhang, et al.
Published: (2025)

InfoPO: Information-Driven Policy Optimization for User-Centric Agents
by: Kong, Fanqi, et al.
Published: (2026)

RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning
by: Wang, Zihan, et al.
Published: (2025)

Understanding Inverse Reinforcement Learning under Overparameterization: Non-Asymptotic Analysis and Global Optimality
by: Zhang, Ruijia, et al.
Published: (2025)

On the Expressive Power of Mixture-of-Experts for Structured Complex Tasks
by: Wang, Mingze, et al.
Published: (2025)

RieMind: Geometry-Grounded Spatial Agent for Scene Understanding
by: Ropero, Fernando, et al.
Published: (2026)

UrbanMind: Urban Dynamics Prediction with Multifaceted Spatial-Temporal Large Language Models
by: Liu, Yuhang, et al.
Published: (2025)

Destroy and Repair Using Hyper Graphs for Routing
by: Li, Ke, et al.
Published: (2025)

CoMind: Towards Community-Driven Agents for Machine Learning Engineering
by: Li, Sijie, et al.
Published: (2025)

Chain-of-Query: Unleashing the Power of LLMs in SQL-Aided Table Understanding via Multi-Agent Collaboration
by: Sui, Songyuan, et al.
Published: (2025)

Matched Filtering-Based Channel Estimation for AFDM Systems in Doubly Selective Channels
by: Li, Xiangjun, et al.
Published: (2025)

Agentic Reinforcement Learning with Implicit Step Rewards
by: Liu, Xiaoqian, et al.
Published: (2025)

An Online Review‐Driven Multiattribute Decision‐Making Model Based on Rough‐Cloud‐Integrated Three‐Way Decisions
by: Fan Jia, et al.
Published: (2026)

Understanding Nonlinear Collaboration between Human and AI Agents: A Co-design Framework for Creative Design
by: Zhou, Jiayi, et al.
Published: (2024)

Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning
by: Liu, Yuhong, et al.
Published: (2025)

Functionality Locality, Mixture & Control = Logic = Memory
by: Peng, Xiangjun
Published: (2024)

Preliminary analysis on the interdecadal variation characteristics of typhoon over the Northwestern Pacific in the past sixty years.
by: Yu, Fan, et al.
Published: (2012)
