| Main Authors: | Bhatti, Amit Singh, Vaddina, Vishal, Birru, Dagnachew |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.19402 |
Similar Items
ORPO-Distill: Mixed-Policy Preference Optimization for Cross-Architecture LLM Distillation
by: Singh, Aasheesh, et al.
Published: (2025)
Responsible AI for General-Purpose Systems: Overview, Challenges, and A Path Forward
by: Patro, Gourab K, et al.
Published: (2026)
Surrogate-Guided Quantum Discovery in Black-Box Landscapes with Latent-Quadratic Interaction Embedding Transformers
by: Gopalakrishnan, Saisubramaniam, et al.
Published: (2026)
A Multi-Objective Genetic Algorithm for Healthcare Workforce Scheduling
by: Patel, Vipul, et al.
Published: (2025)
Knowledge Graph Based Repository-Level Code Generation
by: Athale, Mihir, et al.
Published: (2025)
Search-Based Risk Feature Discovery in Document Structure Spaces under a Constrained Budget
by: Gopalakrishnan, Saisubramaniam, et al.
Published: (2026)
Accelerated Gradient-based Design Optimization Via Differentiable Physics-Informed Neural Operator: A Composites Autoclave Processing Case Study
by: Patel, Janak M., et al.
Published: (2025)
ACT: Bridging the Gap in Code Translation through Synthetic Data Generation & Adaptive Training
by: Saxena, Shreya, et al.
Published: (2025)
PickLLM: Context-Aware RL-Assisted Large Language Model Routing
by: Sikeridis, Dimitrios, et al.
Published: (2024)
Efficient Training-Free Online Routing for High-Volume Multi-LLM Serving
by: Wu, Fangzhou, et al.
Published: (2025)
SLA-Aware Distributed LLM Inference Across Device-RAN-Cloud
by: Yet, Hariz, et al.
Published: (2026)
HI-SQL: Optimizing Text-to-SQL Systems through Dynamic Hint Integration
by: Parab, Ganesh, et al.
Published: (2025)
LAPS: A Length-Aware-Prefill LLM Serving System
by: She, Jianshu, et al.
Published: (2026)
Orla: A Library for Serving LLM-Based Multi-Agent Systems
by: Shahout, Rana, et al.
Published: (2026)
SLA2: Sparse-Linear Attention with Learnable Routing and QAT
by: Zhang, Jintao, et al.
Published: (2026)
TinyServe: Query-Aware Cache Selection for Efficient LLM Serving
by: Liu, Dong, et al.
Published: (2025)
GAR: Carbon-Aware Routing for LLM Inference via Constrained Optimization
by: Sheshanarayana, Disha, et al.
Published: (2026)
SeqRoute: Global Budget-Aware Sequential LLM Routing via Offline Reinforcement Learning
by: Xu, Zhongling, et al.
Published: (2026)
POLAR: Online Learning for LoRA Adapter Caching and Routing in Edge LLM Serving
by: Li, Shaoang, et al.
Published: (2026)
StreamServe: Adaptive Speculative Flows for Low-Latency Disaggregated LLM Serving
by: Kumar, Satyam, et al.
Published: (2026)
Efficient Mixture-of-Agents Serving via Tree-Structured Routing, Adaptive Pruning, and Dependency-Aware Prefill-Decode Overlap
by: Wang, Zijun, et al.
Published: (2025)
LLM-Guided Lifecycle-Aware Clustering of Multi-Turn Customer Support Conversations
by: Pattnayak, Priyaranjan, et al.
Published: (2026)
SkyRL-Agent: Efficient RL Training for Multi-turn LLM Agent
by: Cao, Shiyi, et al.
Published: (2025)
Constraint-Aware Route Recommendation from Natural Language via Hierarchical LLM Agents
by: Zhe, Tao, et al.
Published: (2025)
SOMA: Efficient Multi-turn LLM Serving via Small Language Model
by: Cheng, Xueqi, et al.
Published: (2026)
SpecMap: Hierarchical LLM Agent for Datasheet-to-Code Traceability Link Recovery in Systems Engineering
by: Nipane, Vedant, et al.
Published: (2026)
RouterBench: A Benchmark for Multi-LLM Routing System
by: Hu, Qitian Jason, et al.
Published: (2024)
RCR-Router: Efficient Role-Aware Context Routing for Multi-Agent LLM Systems with Structured Memory
by: Liu, Jun, et al.
Published: (2025)
Towards Socially and Morally Aware RL agent: Reward Design With LLM
by: Wang, Zhaoyue
Published: (2024)
Cooperative Multi-agent RL with Communication Constraints
by: Xiong, Nuoya, et al.
Published: (2026)
Unsolvability Ceiling in Multi-LLM Routing: An Empirical Study of Evaluation Artifacts
by: Garg, Saloni, et al.
Published: (2026)
VibeServe: Can AI Agents Build Bespoke LLM Serving Systems?
by: Kamahori, Keisuke, et al.
Published: (2026)
MTRouter: Cost-Aware Multi-Turn LLM Routing with History-Model Joint Embeddings
by: Zhang, Yiqun, et al.
Published: (2026)
Efficient and Interpretable Multi-Agent LLM Routing via Ant Colony Optimization
by: Wang, Xudong, et al.
Published: (2026)
Chatsparent: An Interactive System for Detecting and Mitigating Cognitive Fatigue in LLMs
by: Marwah, Riju, et al.
Published: (2025)
PALS: Power-Aware LLM Serving for Mixture-of-Experts Models
by: Hankendi, Can, et al.
Published: (2026)
ProRL Agent: Rollout-as-a-Service for RL Training of Multi-Turn LLM Agents
by: Zhang, Hao, et al.
Published: (2026)
LoopServe: An Adaptive Dual-phase LLM Inference Acceleration System for Multi-Turn Dialogues
by: Li, Haoyang, et al.
Published: (2025)
VoltanaLLM: Feedback-Driven Frequency Control and State-Space Routing for Energy-Efficient LLM Serving
by: Yu, Jiahuan, et al.
Published: (2025)
IRT-Router: Effective and Interpretable Multi-LLM Routing via Item Response Theory
by: Song, Wei, et al.
Published: (2025)