Reactive Scaling

Before you begin

You need to have a Kubernetes cluster with Kapacity and Prometheus installed.

Run sample workload

Download nginx-statefulset.yaml and run following command to run an NGINX workload:

kubectl apply -f nginx-statefulset.yaml

Check if the workload is running:

kubectl get po
NAME READY STATUS RESTARTS AGE nginx-0 1/1 Running 0 5s

Create IHPA with dynamic reactive portrait provider

Download dynamic-reactive-portrait-sample.yaml which looks like this:

apiVersion: autoscaling.kapacitystack.io/v1alpha1 kind: IntelligentHorizontalPodAutoscaler metadata: name: dynamic-reactive-portrait-sample spec: scaleTargetRef: apiVersion: apps/v1 kind: StatefulSet name: nginx minReplicas: 1 maxReplicas: 10 portraitProviders: - type: Dynamic priority: 1 dynamic: portraitType: Reactive metrics: - type: Resource resource: name: cpu target: type: Utilization averageUtilization: 30 algorithm: type: KubeHPA

Run following command to create the IHPA:

kubectl apply -f dynamic-reactive-portrait-sample.yaml

Increase the load

Run following command to get the ClusterIP and port of the NGINX service:

kubectl get svc nginx
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE nginx ClusterIP 10.111.21.74 <none> 80/TCP 13m

Start a different pod to act as a client which will send requests to the NGINX service infinitely with the service ip and port replaced by the value got in previous step:

# Run this in a separate terminal so that the load generation continues and you can carry on with the rest of the steps kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://<service-ip>:<service-port> > /dev/null; done"

After several minutes, you can see that the workload is scaled up by checking events of the IHPA:

kubectl describe ihpa dynamic-reactive-portrait-sample
... Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal CreateReplicaProfile 6m58s ihpa_controller create ReplicaProfile with onlineReplcas: 1, cutoffReplicas: 0, standbyReplicas: 0 Normal UpdateReplicaProfile 3m45s ihpa_controller update ReplicaProfile with onlineReplcas: 1 -> 6, cutoffReplicas: 0 -> 0, standbyReplicas: 0 -> 0

Stop generating load

In the terminal where you created the Pod that runs a busybox image, terminate the load generation by typing <Ctrl> + C.

After several minutes, you can see that the workload is scaled down by checking events of the IHPA:

kubectl describe ihpa dynamic-reactive-portrait-sample
... Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal CreateReplicaProfile 9m58s ihpa_controller create ReplicaProfile with onlineReplcas: 1, cutoffReplicas: 0, standbyReplicas: 0 Normal UpdateReplicaProfile 6m45s ihpa_controller update ReplicaProfile with onlineReplcas: 1 -> 6, cutoffReplicas: 0 -> 0, standbyReplicas: 0 -> 0 Normal UpdateReplicaProfile 3m15s ihpa_controller update ReplicaProfile with onlineReplcas: 6 -> 4, cutoffReplicas: 0 -> 0, standbyReplicas: 0 -> 0 Normal UpdateReplicaProfile 2m45s ihpa_controller update ReplicaProfile with onlineReplcas: 4 -> 1, cutoffReplicas: 0 -> 0, standbyReplicas: 0 -> 0

Cleanup

Run following command to cleanup all the resources:

kubectl delete -f dynamic-reactive-portrait-sample.yaml kubectl delete -f nginx-statefulset.yaml
Last modified October 30, 2023: overall doc tweak (b9c2658)