Staff View: Models Under SCOPE: Scalable and Controllable Routing via Pre-hoc Reasoning :: Library Catalog

Bibliographic Details
Main Authors:	Cao, Qi; Zhang, Shuhao; Zhou, Ruizhe; Zhang, Ruiyi; Qin, Peijia; Xie, Pengtao
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2601.22323

_version_	1866908834226765824
author	Cao, Qi; Zhang, Shuhao; Zhou, Ruizhe; Zhang, Ruiyi; Qin, Peijia; Xie, Pengtao
author_facet	Cao, Qi; Zhang, Shuhao; Zhou, Ruizhe; Zhang, Ruiyi; Qin, Peijia; Xie, Pengtao
contents	Model routing chooses which language model to use for each query. By sending easy queries to cheaper models and hard queries to stronger ones, it can significantly reduce inference cost while maintaining high accuracy. However, most existing routers treat this as a fixed choice among a small set of models, which makes them hard to adapt to new models or changing budget constraints. In this paper, we propose SCOPE (Scalable and Controllable Outcome Performance Estimator), a routing framework that goes beyond model selection by predicting each candidate model's cost and performance. Trained with reinforcement learning, SCOPE makes reasoning-based predictions by retrieving how models behave on similar problems, rather than relying on fixed model names, enabling it to work with new, unseen models. Moreover, by explicitly predicting how accurate and how expensive a model will be, it turns routing into a dynamic decision problem, allowing users to easily control the trade-off between accuracy and cost. Experiments show that SCOPE is more than just a cost-saving tool. It flexibly adapts to user needs: it can boost accuracy by up to 25.7% when performance is the priority, or cut costs by up to 95.1% when efficiency matters most.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_22323
institution	arXiv
publishDate	2026
record_format	arxiv
title	Models Under SCOPE: Scalable and Controllable Routing via Pre-hoc Reasoning
topic	Machine Learning
url	https://arxiv.org/abs/2601.22323
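
Illustrative note (not part of the catalog record): the abstract says that predicting each model's accuracy and cost lets users control the trade-off between the two. A minimal sketch of that idea follows, assuming a simple linear utility (predicted accuracy minus a user-chosen weight times predicted cost). The utility form, the `Prediction` type, the `route` function, the model names, and the numbers are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Hypothetical per-model estimate for a single query."""
    model: str
    accuracy: float  # predicted probability of a correct answer, in [0, 1]
    cost: float      # predicted cost of answering the query, arbitrary units


def route(predictions, cost_weight):
    """Return the model name with the highest accuracy minus weighted cost.

    cost_weight is the user's control knob (an assumption of this sketch):
    0.0 ignores cost entirely, larger values favor cheaper models.
    """
    if not predictions:
        raise ValueError("need at least one candidate model")
    best = max(predictions, key=lambda p: p.accuracy - cost_weight * p.cost)
    return best.model


# Made-up estimates for two placeholder models.
candidates = [
    Prediction("small-model", accuracy=0.70, cost=0.10),
    Prediction("large-model", accuracy=0.90, cost=1.00),
]

accuracy_first = route(candidates, cost_weight=0.0)
cost_sensitive = route(candidates, cost_weight=1.0)
```

With these made-up numbers, a weight of 0.0 selects the more accurate placeholder model, while a weight of 1.0 selects the cheaper one. Because the rule only consumes per-model estimates, adding a new candidate means adding one more `Prediction` rather than retraining a fixed-choice classifier.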
