Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Yan, Cheng; Zhang, Wuyang; Ning, Zhiyuan; Xu, Fan; Tao, Ziyang; Zhang, Lu; Yin, Bing; Zhang, Yanyong
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2601.06220

_version_	1866917194241146880
author	Yan, Cheng; Zhang, Wuyang; Ning, Zhiyuan; Xu, Fan; Tao, Ziyang; Zhang, Lu; Yin, Bing; Zhang, Yanyong
author_facet	Yan, Cheng; Zhang, Wuyang; Ning, Zhiyuan; Xu, Fan; Tao, Ziyang; Zhang, Lu; Yin, Bing; Zhang, Yanyong
contents	The rapid proliferation of Large Language Models (LLMs) has led to a fragmented and inefficient ecosystem, a state of "model lock-in" where seamlessly integrating novel models remains a significant bottleneck. Current routing frameworks require exhaustive, costly retraining, hindering scalability and adaptability. We introduce ZeroRouter, a new paradigm for LLM routing that breaks this lock-in. Our approach is founded on a universal latent space, a model-agnostic representation of query difficulty that fundamentally decouples the characterization of a query from the profiling of a model. This allows for zero-shot onboarding of new models without full-scale retraining. ZeroRouter features a context-aware predictor that maps queries to this universal space and a dual-mode optimizer that balances accuracy, cost, and latency. Our framework consistently outperforms all baselines, delivering higher accuracy at lower cost and latency.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_06220
institution	arXiv
publishDate	2026
record_format	arxiv
title	Breaking Model Lock-in: Cost-Efficient Zero-Shot LLM Routing via a Universal Latent Space
topic	Machine Learning; Artificial Intelligence
url	https://arxiv.org/abs/2601.06220
