:: Library Catalog

Bibliographic Details
Main Authors:	Bhatti, Amit Singh; Vaddina, Vishal; Birru, Dagnachew
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2601.19402

Similar Items

ORPO-Distill: Mixed-Policy Preference Optimization for Cross-Architecture LLM Distillation
by: Singh, Aasheesh, et al.
Published: (2025)

Responsible AI for General-Purpose Systems: Overview, Challenges, and A Path Forward
by: Patro, Gourab K, et al.
Published: (2026)

Surrogate-Guided Quantum Discovery in Black-Box Landscapes with Latent-Quadratic Interaction Embedding Transformers
by: Gopalakrishnan, Saisubramaniam, et al.
Published: (2026)

A Multi-Objective Genetic Algorithm for Healthcare Workforce Scheduling
by: Patel, Vipul, et al.
Published: (2025)

Knowledge Graph Based Repository-Level Code Generation
by: Athale, Mihir, et al.
Published: (2025)

Search-Based Risk Feature Discovery in Document Structure Spaces under a Constrained Budget
by: Gopalakrishnan, Saisubramaniam, et al.
Published: (2026)

Accelerated Gradient-based Design Optimization Via Differentiable Physics-Informed Neural Operator: A Composites Autoclave Processing Case Study
by: Patel, Janak M., et al.
Published: (2025)

ACT: Bridging the Gap in Code Translation through Synthetic Data Generation & Adaptive Training
by: Saxena, Shreya, et al.
Published: (2025)

PickLLM: Context-Aware RL-Assisted Large Language Model Routing
by: Sikeridis, Dimitrios, et al.
Published: (2024)

Efficient Training-Free Online Routing for High-Volume Multi-LLM Serving
by: Wu, Fangzhou, et al.
Published: (2025)

SLA-Aware Distributed LLM Inference Across Device-RAN-Cloud
by: Yet, Hariz, et al.
Published: (2026)

HI-SQL: Optimizing Text-to-SQL Systems through Dynamic Hint Integration
by: Parab, Ganesh, et al.
Published: (2025)

LAPS: A Length-Aware-Prefill LLM Serving System
by: She, Jianshu, et al.
Published: (2026)

Orla: A Library for Serving LLM-Based Multi-Agent Systems
by: Shahout, Rana, et al.
Published: (2026)

SLA2: Sparse-Linear Attention with Learnable Routing and QAT
by: Zhang, Jintao, et al.
Published: (2026)

TinyServe: Query-Aware Cache Selection for Efficient LLM Serving
by: Liu, Dong, et al.
Published: (2025)

GAR: Carbon-Aware Routing for LLM Inference via Constrained Optimization
by: Sheshanarayana, Disha, et al.
Published: (2026)

SeqRoute: Global Budget-Aware Sequential LLM Routing via Offline Reinforcement Learning
by: Xu, Zhongling, et al.
Published: (2026)

POLAR: Online Learning for LoRA Adapter Caching and Routing in Edge LLM Serving
by: Li, Shaoang, et al.
Published: (2026)

StreamServe: Adaptive Speculative Flows for Low-Latency Disaggregated LLM Serving
by: Kumar, Satyam, et al.
Published: (2026)

Efficient Mixture-of-Agents Serving via Tree-Structured Routing, Adaptive Pruning, and Dependency-Aware Prefill-Decode Overlap
by: Wang, Zijun, et al.
Published: (2025)

LLM-Guided Lifecycle-Aware Clustering of Multi-Turn Customer Support Conversations
by: Pattnayak, Priyaranjan, et al.
Published: (2026)

SkyRL-Agent: Efficient RL Training for Multi-turn LLM Agent
by: Cao, Shiyi, et al.
Published: (2025)

Constraint-Aware Route Recommendation from Natural Language via Hierarchical LLM Agents
by: Zhe, Tao, et al.
Published: (2025)

SOMA: Efficient Multi-turn LLM Serving via Small Language Model
by: Cheng, Xueqi, et al.
Published: (2026)

SpecMap: Hierarchical LLM Agent for Datasheet-to-Code Traceability Link Recovery in Systems Engineering
by: Nipane, Vedant, et al.
Published: (2026)

RouterBench: A Benchmark for Multi-LLM Routing System
by: Hu, Qitian Jason, et al.
Published: (2024)

RCR-Router: Efficient Role-Aware Context Routing for Multi-Agent LLM Systems with Structured Memory
by: Liu, Jun, et al.
Published: (2025)

Towards Socially and Morally Aware RL agent: Reward Design With LLM
by: Wang, Zhaoyue
Published: (2024)

Cooperative Multi-agent RL with Communication Constraints
by: Xiong, Nuoya, et al.
Published: (2026)

Unsolvability Ceiling in Multi-LLM Routing: An Empirical Study of Evaluation Artifacts
by: Garg, Saloni, et al.
Published: (2026)

VibeServe: Can AI Agents Build Bespoke LLM Serving Systems?
by: Kamahori, Keisuke, et al.
Published: (2026)

MTRouter: Cost-Aware Multi-Turn LLM Routing with History-Model Joint Embeddings
by: Zhang, Yiqun, et al.
Published: (2026)

Efficient and Interpretable Multi-Agent LLM Routing via Ant Colony Optimization
by: Wang, Xudong, et al.
Published: (2026)

Chatsparent: An Interactive System for Detecting and Mitigating Cognitive Fatigue in LLMs
by: Marwah, Riju, et al.
Published: (2025)

PALS: Power-Aware LLM Serving for Mixture-of-Experts Models
by: Hankendi, Can, et al.
Published: (2026)

ProRL Agent: Rollout-as-a-Service for RL Training of Multi-Turn LLM Agents
by: Zhang, Hao, et al.
Published: (2026)

LoopServe: An Adaptive Dual-phase LLM Inference Acceleration System for Multi-Turn Dialogues
by: Li, Haoyang, et al.
Published: (2025)

VoltanaLLM: Feedback-Driven Frequency Control and State-Space Routing for Energy-Efficient LLM Serving
by: Yu, Jiahuan, et al.
Published: (2025)

IRT-Router: Effective and Interpretable Multi-LLM Routing via Item Response Theory
by: Song, Wei, et al.
Published: (2025)