:: Library Catalog

Bibliographic details
Main authors:	Pan, Guanzhong; Chodnekar, Vishal; Roy, Abinas; Wang, Haibo
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence; Machine Learning
Online access:	https://arxiv.org/abs/2509.18101

Similar items

A Middle Path for On-Premises LLM Deployment: Preserving Privacy Without Sacrificing Model Confidentiality
by: Huang, Hanbo, et al.
Published: (2024)

VSPrefill: Vertical-Slash Sparse Attention with Lightweight Indexing for Long-Context Prefilling
by: Guanzhong, Chen
Published: (2026)

Multimodal Survival Analysis with Locally Deployable Large Language Models
by: Gögl, Moritz, et al.
Published: (2026)

CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning
by: Wu, Duo, et al.
Published: (2024)

AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment
by: Fu, Yonggan, et al.
Published: (2024)

Comparing Specialised Small and General Large Language Models on Text Classification: 100 Labelled Samples to Achieve Break-Even Performance
by: Pecher, Branislav, et al.
Published: (2024)

SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment
by: Song, Yixin, et al.
Published: (2025)

Do Large Language Models Reason Causally Like Us? Even Better?
by: Dettki, Hanna M., et al.
Published: (2025)

Deploying Open-Source Large Language Models: A performance Analysis
by: Bendi-Ouis, Yannis, et al.
Published: (2024)

Performance Control in Early Exiting to Deploy Large Models at the Same Cost of Smaller Ones
by: Mofakhami, Mehrnaz, et al.
Published: (2024)

Provable Benefits of In-Tool Learning for Large Language Models
by: Houliston, Sam, et al.
Published: (2025)

Are Large-Language Models Graph Algorithmic Reasoners?
by: Taylor, Alexander K, et al.
Published: (2024)

On-Premise SLMs vs. Commercial LLMs: Prompt Engineering and Incident Classification in SOCs and CSIRTs
by: Almeida, Gefté, et al.
Published: (2025)

FlattenQuant: Breaking Through the Inference Compute-bound for Large Language Models with Per-tensor Quantization
by: Zhang, Yi, et al.
Published: (2024)

Breaking Model Lock-in: Cost-Efficient Zero-Shot LLM Routing via a Universal Latent Space
by: Yan, Cheng, et al.
Published: (2026)

On-Demand Multi-Task Sparsity for Efficient Large-Model Deployment on Edge Devices
by: Huang, Lianming, et al.
Published: (2025)

Modular Machine Learning: An Indispensable Path towards New-Generation Large Language Models
by: Wang, Xin, et al.
Published: (2025)

Time-LLM: Time Series Forecasting by Reprogramming Large Language Models
by: Jin, Ming, et al.
Published: (2023)

Steering Without Breaking: Mechanistically Informed Interventions for Discrete Diffusion Language Models
by: Zhou, Hanhan, et al.
Published: (2026)

CASCADE: Case-Based Continual Adaptation for Large Language Models During Deployment
by: Guo, Siyuan, et al.
Published: (2026)

ML Compass: Navigating Capability, Cost, and Compliance Trade-offs in AI Model Deployment
by: Digalakis Jr, Vassilis, et al.
Published: (2025)

Premise Selection for a Lean Hammer
by: Zhu, Thomas, et al.
Published: (2025)

Breaking the Factorization Barrier in Diffusion Language Models
by: Li, Ian, et al.
Published: (2026)

SLMQuant: Benchmarking Small Language Model Quantization for Practical Deployment
by: Wang, Jiacheng, et al.
Published: (2025)

CONSTRUCTA: Automating Commercial Construction Schedules in Fabrication Facilities with Large Language Models
by: Zhang, Yifan, et al.
Published: (2025)

Scaling Down to Scale Up: A Cost-Benefit Analysis of Replacing OpenAI's LLM with Open Source SLMs in Production
by: Irugalbandara, Chandra, et al.
Published: (2023)

Large Language Models for Controllable Multi-property Multi-objective Molecule Optimization
by: Dey, Vishal, et al.
Published: (2025)

SLOT: Structuring the Output of Large Language Models
by: Wang, Darren Yow-Bang, et al.
Published: (2025)

Private LLM Inference on Consumer Blackwell GPUs: A Practical Guide for Cost-Effective Local Deployment in SMEs
by: Knoop, Jonathan, et al.
Published: (2026)

Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?
by: Fan, Chenrui, et al.
Published: (2025)

Advancing Model Refinement: Muon-Optimized Distillation and Quantization for LLM Deployment
by: Sander, Jacob, et al.
Published: (2026)

Theoretical Benefit and Limitation of Diffusion Language Model
by: Feng, Guhao, et al.
Published: (2025)

Budget-Constrained Agentic Large Language Models: Intention-Based Planning for Costly Tool Use
by: Liu, Hanbing, et al.
Published: (2026)

Robust Batch-Level Query Routing for Large Language Models under Cost and Capacity Constraints
by: Markovic-Voronov, Jelena, et al.
Published: (2026)

Renewable Energy Prediction: A Comparative Study of Deep Learning Models for Complex Dataset Analysis
by: Wang, Haibo, et al.
Published: (2025)

Beyond the Black Box: A Statistical Model for LLM Reasoning and Inference
by: Dalal, Siddhartha, et al.
Published: (2024)

$\text{M}^{2}$LLM: Multi-view Molecular Representation Learning with Large Language Models
by: Ju, Jiaxin, et al.
Published: (2025)

EEGAgent: A Unified Framework for Automated EEG Analysis Using Large Language Models
by: Zhao, Sha, et al.
Published: (2025)

Self-Reported Confidence of Large Language Models in Gastroenterology: Analysis of Commercial, Open-Source, and Quantized Models
by: Naderi, Nariman, et al.
Published: (2025)

Position: What Can Large Language Models Tell Us about Time Series Analysis
by: Jin, Ming, et al.
Published: (2024)