Calculate Kubernetes HPA pod counts based on CPU/memory thresholds, current utilization, and scaling targets.
The Kubernetes Horizontal Pod Autoscaler (HPA) automatically adjusts the number of pod replicas based on observed CPU utilization, memory usage, or custom metrics. Understanding how HPA calculates desired replica count is essential for reliable autoscaling.
The HPA formula is: desiredReplicas = ceil(currentReplicas × (currentMetricValue / desiredMetricValue)). For example, if 3 pods are running at 80% CPU with a target of 50%, HPA desires ceil(3 × 80/50) = ceil(4.8) = 5 pods.
This calculator helps you predict HPA behavior under different load scenarios, set appropriate min/max replica bounds, and choose target utilization thresholds that balance responsiveness with cost efficiency.
Calculating expected replica counts in advance supports proactive capacity planning: teams can verify that their min/max bounds and utilization targets will hold under projected load, rather than tuning reactively after an incident.
A misconfigured HPA leads either to under-scaling (outages) or over-scaling (wasted spend). Predicting pod counts at different utilization levels helps you choose bounds and targets that avoid both failure modes, and re-checking the numbers as traffic patterns change catches configuration drift before it affects users.
Desired Replicas = ceil(current_replicas × (current_utilization / target_utilization))
Clamped Replicas = clamp(desired_replicas, min_replicas, max_replicas)
Scale Factor = current_utilization / target_utilization
Result: 5 desired pods (scale up by 2)
Scale factor: 80% / 50% = 1.6. Desired: ceil(3 × 1.6) = ceil(4.8) = 5 pods. Clamped between min 2 and max 20: 5 pods. HPA will scale from 3 to 5 pods to bring average utilization back toward 50%.
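The calculation above can be sketched as a small function (the name and signature are illustrative, not the HPA controller's internal API):

```python
import math

def desired_replicas(current_replicas, current_util, target_util,
                     min_replicas, max_replicas):
    """Apply the HPA formula, then clamp to the configured bounds."""
    desired = math.ceil(current_replicas * current_util / target_util)
    return max(min_replicas, min(desired, max_replicas))

# Worked example from above: 3 pods at 80% CPU, target 50%, bounds [2, 20]
print(desired_replicas(3, 80, 50, 2, 20))  # -> 5
```

Note that the clamp means a low max_replicas silently caps scale-up: with the same inputs but max_replicas=4, the result is 4 even though the formula wants 5.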
HPA runs a control loop every 15 seconds (configurable). It queries the metrics API for current utilization, computes the desired replica count, and applies the change (subject to stabilization windows). The algorithm is simple but the interactions with pod lifecycle, resource requests, and custom metrics create complexity.
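One detail of the loop worth modeling: the controller skips scaling entirely when the current/target ratio is within a tolerance (10% by default, configurable via `--horizontal-pod-autoscaler-tolerance`). A simplified sketch of a single reconciliation tick:

```python
import math

def reconcile(current, util, target, tolerance=0.1):
    """One control-loop tick: skip scaling if the current/target
    utilization ratio is within tolerance, else apply the formula."""
    ratio = util / target
    if abs(ratio - 1.0) <= tolerance:
        return current  # close enough: no scaling action
    return math.ceil(current * ratio)

print(reconcile(5, 52, 50))  # ratio 1.04, within tolerance -> stays at 5
print(reconcile(5, 80, 50))  # ratio 1.6 -> scales to 8
```

Without the tolerance band, tiny metric fluctuations around the target would produce constant single-pod churn.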
Min replicas should match your availability requirements: 2 for basic HA, 3 for zone redundancy, more for high-traffic services. Max replicas should reflect your budget ceiling and cluster capacity. Setting max too high risks exhausting cluster resources.
CPU is a lagging indicator. By the time CPU spikes, requests may already be queuing. Custom metrics like requests-per-second, queue depth, or in-flight connections are leading indicators that enable proactive scaling before performance degrades.
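The same ratio formula applies to a leading indicator such as requests per second per pod; the values below are illustrative, not from a real workload:

```python
import math

current_replicas = 4
current_rps_per_pod = 150   # observed average requests/sec per pod
target_rps_per_pod = 100    # desired average requests/sec per pod

# Identical shape to the CPU calculation, just a different metric
desired = math.ceil(current_replicas * current_rps_per_pod / target_rps_per_pod)
print(desired)  # -> 6
```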
A target of 50–60% is common for latency-sensitive services, leaving headroom for spikes. CPU-bound batch processing can use 70–80%. Lower targets scale more aggressively (more pods, higher cost). Higher targets risk under-provisioning during spikes.
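To see the cost side of that trade-off, hold total demand fixed and vary the target. Here demand is expressed as "pod-equivalents at 100% CPU" (an illustrative unit, e.g. 8 pods each running at 50%):

```python
import math

total_demand = 4.0  # total CPU demand in pod-equivalents at 100%
for target in (50, 60, 70, 80):
    pods = math.ceil(total_demand * 100 / target)
    print(f"target {target}%: {pods} pods")
# target 50%: 8 pods ... target 80%: 5 pods
```

Lowering the target from 80% to 50% costs three extra pods for the same load; the difference buys headroom for spikes.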
HPA has a default stabilization window of 5 minutes for scale-down (configurable). During scale-down, it uses the highest replica recommendation computed over that window, so a single recent spike blocks the scale-down. This prevents flapping. Scale-up has no default window, allowing fast response to rising load.
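The scale-down rule reduces to a max over recent recommendations, which a one-line sketch makes concrete:

```python
def scale_down_decision(recommendations):
    """Scale-down uses the highest recommendation seen in the
    stabilization window, so one high sample blocks a scale-down
    even if load has since dropped."""
    return max(recommendations)

# Replica recommendations from the last 5 minutes of control-loop ticks:
window = [8, 6, 5, 5, 5]
print(scale_down_decision(window))  # -> 8: no scale-down below 8 yet
```

Only once the 8 ages out of the window can the deployment shrink toward 5.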
Standard HPA cannot scale to zero (minReplicas must be at least 1). KEDA (Kubernetes Event-Driven Autoscaling) supports scale-to-zero based on queue length, HTTP request rate, or other event sources. This is useful for batch workloads and cost optimization.
Configure HPA to scale on memory utilization instead of, or in addition to, CPU by listing multiple metrics in the HPA spec. When multiple metrics are defined, HPA computes a desired replica count for each and uses the highest.
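The multi-metric rule is a per-metric application of the same formula followed by a max; a sketch with illustrative metric values:

```python
import math

def desired_from_metrics(current, metrics):
    """With multiple metrics, HPA computes a replica count per metric
    and uses the highest. `metrics` maps name -> (current, target)."""
    proposals = {
        name: math.ceil(current * cur / tgt)
        for name, (cur, tgt) in metrics.items()
    }
    return max(proposals.values()), proposals

desired, per_metric = desired_from_metrics(
    4, {"cpu": (60, 50), "memory": (90, 70)}
)
print(per_metric, "->", desired)  # memory wins: 6 replicas
```

Taking the max means adding a metric can only make scaling more aggressive, never less.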
HPA calculates utilization relative to resource requests. If a pod requests 100m CPU and uses 80m, that's 80% utilization. Setting requests too high makes utilization appear low (under-scaling). Setting too low makes it appear high (over-scaling).
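Because the denominator is the request, the same absolute usage can look hot or idle. A minimal illustration (millicore values are made up):

```python
def utilization_pct(usage_m, request_m):
    """Utilization as HPA sees it: usage divided by the pod's
    resource *request*, not node capacity."""
    return 100 * usage_m / request_m

print(utilization_pct(80, 100))  # 80.0 -> looks hot with a small request
print(utilization_pct(80, 400))  # 20.0 -> same usage looks idle with a large one
```

The second pod never triggers scale-up even under the same real load, which is the under-scaling risk of over-sized requests.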
HPA scales horizontally (more pods). VPA scales vertically (more resources per pod). HPA is better for stateless workloads; VPA is better for single-instance stateful workloads. They can be used together with care, but don't drive both from the same metric (CPU or memory), or the two controllers will fight each other.