:: Library Catalog

Bibliographic Details
Main Author:	Kotte, Varun
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Computation and Language
Online Access:	https://arxiv.org/abs/2605.18796

Similar Items

PASC: Pipeline-Aware Conformal Prediction with Joint Coverage Guarantees for Multi-Stage NLP and LLM Pipelines
by: Kotte, Varun
Published: (2026)

PromptPort: A Reliability Layer for Cross-Model Structured Extraction
by: Kotte, Varun
Published: (2026)

Not All Queries Need Rewriting: When Prompt-Only LLM Refinement Helps and Hurts Dense Retrieval
by: Kotte, Varun
Published: (2026)

Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing
by: Ding, Dujian, et al.
Published: (2024)

BEST-Route: Adaptive LLM Routing with Test-Time Optimal Compute
by: Ding, Dujian, et al.
Published: (2025)

Reasoning Is Not Free: Robust Adaptive Cost-Efficient Routing for LLM-as-a-Judge
by: Zhang, Wenbo, et al.
Published: (2026)

Aligning LLMs with Human Uncertainty: A Beta-Bernoulli Calibrator for LLM Forecasting
by: Dai, Hui, et al.
Published: (2026)

Retrieval Augmented Generation for Domain-specific Question Answering
by: Sharma, Sanat, et al.
Published: (2024)

Confident or Seek Stronger: Exploring Uncertainty-Based On-device LLM Routing From Benchmarking to Generalization
by: Chuang, Yu-Neng, et al.
Published: (2025)

DTRNet: Dynamic Token Routing Network to Reduce Quadratic Costs in Transformers
by: Sharma, Aman, et al.
Published: (2025)

Beyond the Score: Uncertainty-Calibrated LLMs for Automated Essay Assessment
by: Karim, Ahmed, et al.
Published: (2025)

Cascade Speculative Drafting for Even Faster LLM Inference
by: Chen, Ziyi, et al.
Published: (2023)

LLM Router: Rethinking Routing with Prefill Activations
by: Varshney, Tanay, et al.
Published: (2026)

Universal Model Routing for Efficient LLM Inference
by: Jitkrittum, Wittawat, et al.
Published: (2025)

Task-Aware Calibration: Provably Optimal Decoding in LLMs
by: Tomov, Tim, et al.
Published: (2026)

RouteLLM: Learning to Route LLMs with Preference Data
by: Ong, Isaac, et al.
Published: (2024)

Mediator: Memory-efficient LLM Merging with Less Parameter Conflicts and Uncertainty Based Routing
by: Lai, Kunfeng, et al.
Published: (2025)

Why Semantic Entropy Fails: Geometry-Aware and Calibrated Uncertainty for Policy Optimization
by: Zhang, Zheyuan, et al.
Published: (2026)

Benchmarking Uncertainty Calibration in Large Language Model Long-Form Question Answering
by: Müller, Philip, et al.
Published: (2026)

Adaptive Multi-Expert Reasoning via Difficulty-Aware Routing and Uncertainty-Guided Aggregation
by: Ehab, Mohamed, et al.
Published: (2026)

Process Supervision of Confidence Margin for Calibrated LLM Reasoning
by: Wang, Liaoyaqi, et al.
Published: (2026)

Revisiting Uncertainty Estimation and Calibration of Large Language Models
by: Tao, Linwei, et al.
Published: (2025)

Uncertainty in Language Models: Assessment through Rank-Calibration
by: Huang, Xinmeng, et al.
Published: (2024)

On Subjective Uncertainty Quantification and Calibration in Natural Language Generation
by: Wang, Ziyu, et al.
Published: (2024)

Reward-Based Online LLM Routing via NeuralUCB
by: Tsai, Ming-Hua, et al.
Published: (2026)

RouteNLP: Closed-Loop LLM Routing with Conformal Cascading and Distillation Co-Optimization
by: Guo, Dongxin, et al.
Published: (2026)

Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning
by: Yue, Murong, et al.
Published: (2023)

DartQuant: Efficient Rotational Distribution Calibration for LLM Quantization
by: Shao, Yuantian, et al.
Published: (2025)

TweakLLM: A Routing Architecture for Dynamic Tailoring of Cached Responses
by: Cheema, Muhammad Taha, et al.
Published: (2025)

Continuous Semantic Caching for Low-Cost LLM Serving
by: Atalar, Baran, et al.
Published: (2026)

Rubric-Conditioned LLM Grading: Alignment, Uncertainty, and Robustness
by: Deng, Haotian, et al.
Published: (2025)

Estimating Semantic Alphabet Size for LLM Uncertainty Quantification
by: McCabe, Lucas H., et al.
Published: (2025)

SAUP: Situation Awareness Uncertainty Propagation on LLM Agent
by: Zhao, Qiwei, et al.
Published: (2024)

Dr.LLM: Dynamic Layer Routing in LLMs
by: Heakl, Ahmed, et al.
Published: (2025)

TCUQ: Single-Pass Uncertainty Quantification from Temporal Consistency with Streaming Conformal Calibration for TinyML
by: Lamaakal, Ismail, et al.
Published: (2025)

An Assessment of Human vs. Model Uncertainty in Soft-Label Learning and Calibration
by: Pavlovic, Maja, et al.
Published: (2026)

Is Escalation Worth It? A Decision-Theoretic Characterization of LLM Cascades
by: Bouchard, Dylan
Published: (2026)

Pause Tokens Strictly Increase the Expressivity of Constant-Depth Transformers
by: London, Charles, et al.
Published: (2025)

PsychePass: Calibrating LLM Therapeutic Competence via Trajectory-Anchored Tournaments
by: Chen, Zhuang, et al.
Published: (2026)

HawkesLLM: Semantic Uncertainty Propagation in Agentic Text Simulation
by: Deng, Zewei, et al.
Published: (2026)