Autoscaling Pods in Kubernetes
If you host your workload in a cloud environment and your traffic pattern fluctuates (think unpredictable), you need a mechanism that automatically scales your workload out (and, of course, back in) so the service keeps meeting its defined Service Level Objective (SLO) without hurting the user experience. This mechanism is called Autoscaling or, to be precise, Horizontal Scaling.
Horizontal Scaling means adding or removing similarly sized machines (think replicas) depending on demand or other conditions. In the context of Kubernetes, that means adding more Pods or removing existing Pods.
Scaling comes in two types, Vertical and Horizontal. This blog focuses on Horizontal Scaling.
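To make "add or remove Pods depending on demand" concrete, here is a minimal sketch of the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric), with no change while the ratio stays within a tolerance (0.1 by default). The function name, the min/max bounds, and the sample numbers are illustrative assumptions, not something taken from a real cluster.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Sketch of the documented HPA formula; names and defaults are illustrative."""
    ratio = current_metric / target_metric

    # Close enough to the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    # Scale proportionally to how far the metric is from the target.
    desired = math.ceil(current_replicas * ratio)

    # Never go below or above the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 Pods averaging 90% CPU against a 60% target: scale out.
    print(desired_replicas(4, 90, 60))
    # 6 Pods averaging 30% CPU against a 60% target: scale in.
    print(desired_replicas(6, 30, 60))
```

Under these assumptions the first call scales out from 4 to 6 Pods and the second scales in from 6 to 3. A real autoscaler also applies stabilization windows and readiness checks, which this sketch leaves out.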
By the way, other components in the system can also affect performance and user experience, but the focus of this blog is the compute layer, realised with Kubernetes Pods.
Apart from ensuring that the service can handle the load, Horizontal Scaling brings a couple of indirect benefits:
- Cost: If you are running in a cloud environment, one of the key goals is to run the workload in a cost-effective manner. To address this, if we can dynamically…