Bibliographic Details
Main Authors: Dehigama, Dilina, Jesalpura, Shyam, Schall, David, Katsarakis, Antonios, Kogias, Marios, Kumar, Rakesh, Grot, Boris
Format: Preprint
Published: 2026
Subjects: Distributed, Parallel, and Cluster Computing
Online Access: https://arxiv.org/abs/2605.23707
_version_ 1866913156733861888
author Dehigama, Dilina
Jesalpura, Shyam
Schall, David
Katsarakis, Antonios
Kogias, Marios
Kumar, Rakesh
Grot, Boris
contents Online services strive to maintain application responsiveness even when the traffic is unpredictable and fluctuating. Today's online services are commonly deployed as chains of microservices, each microservice packaged as one or more containers inside virtual machines (VMs). While performant and affordable when the load is steady, VM-based deployments are known to be slow to scale when the load spikes, resulting in degraded performance for end-users of the service. To avoid such performance degradations, service providers can over-provision their deployments; however, such a strategy is costly and inefficient, leaving resources under-utilized for extended periods. To address the challenge of unpredictable load spikes, we propose Flare, a hybrid microservice architecture that combines VMs with serverless computing. Flare utilizes VMs to cost-effectively handle steady workloads and leverages serverless elasticity to absorb traffic spikes. When a spike occurs, Flare detects which specific service(s) are overloaded and shifts the excess load of only those services to serverless, thus minimizing the cost overhead. Flare seamlessly integrates into existing auto-scaling and serverless infrastructure, requiring minimal changes to the control plane and no modifications to the application.
format Preprint
id arxiv_https___arxiv_org_abs_2605_23707
institution arXiv
publishDate 2026
record_format arxiv
title Flare: Leveraging Serverless Elasticity to Absorb Microservice Load Spikes
topic Distributed, Parallel, and Cluster Computing
url https://arxiv.org/abs/2605.23707