Table of Contents: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Yan, Cheng, Zhang, Wuyang, Ning, Zhiyuan, Xu, Fan, Tao, Ziyang, Zhang, Lu, Yin, Bing, Zhang, Yanyong
Format:	Preprint
Published:	2026
Subjects:	Machine Learning Artificial Intelligence
Online Access:	https://arxiv.org/abs/2601.06220
Tags:	Add Tag No Tags, Be the first to tag this record!

Table of Contents:

The rapid proliferation of Large Language Models (LLMs) has led to a fragmented and inefficient ecosystem, a state of ``model lock-in'' where seamlessly integrating novel models remains a significant bottleneck. Current routing frameworks require exhaustive, costly retraining, hindering scalability and adaptability. We introduce ZeroRouter, a new paradigm for LLM routing that breaks this lock-in. Our approach is founded on a universal latent space, a model-agnostic representation of query difficulty that fundamentally decouples the characterization of a query from the profiling of a model. This allows for zero-shot onboarding of new models without full-scale retraining. ZeroRouter features a context-aware predictor that maps queries to this universal space and a dual-mode optimizer that balances accuracy, cost, and latency. Our framework consistently outperforms all baselines, delivering higher accuracy at lower cost and latency.

Similar Items