Reactive Scaling
Before you begin
You need to have a Kubernetes cluster with Kapacity and Prometheus installed.
Run sample workload
Download nginx-statefulset.yaml and run the following command to start an NGINX workload:
kubectl apply -f nginx-statefulset.yaml
Check if the workload is running:
kubectl get po
NAME READY STATUS RESTARTS AGE
nginx-0 1/1 Running 0 5s
Create IHPA with dynamic reactive portrait provider
Download dynamic-reactive-portrait-sample.yaml, which looks like this:
apiVersion: autoscaling.kapacitystack.io/v1alpha1
kind: IntelligentHorizontalPodAutoscaler
metadata:
  name: dynamic-reactive-portrait-sample
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  portraitProviders:
  - type: Dynamic
    priority: 1
    dynamic:
      portraitType: Reactive
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 30
      algorithm:
        type: KubeHPA
Run the following command to create the IHPA:
kubectl apply -f dynamic-reactive-portrait-sample.yaml
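The sample sets a CPU utilization target of 30 with the KubeHPA algorithm. As a rough illustration of how such a target turns into a replica count, the sketch below assumes that KubeHPA follows the standard Kubernetes HPA formula, desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), clamped to minReplicas and maxReplicas. That assumption, the function name desired_replicas, and the utilization figures passed to it are illustrative and not taken from Kapacity itself; the real algorithm also applies details such as a tolerance band that are omitted here.

```shell
# Hypothetical helper, not part of Kapacity or kubectl.
# Usage: desired_replicas CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL MIN MAX
# Assumes the standard Kubernetes HPA formula; tolerance is ignored.
desired_replicas() {
  current=$1
  util=$2
  target=$3
  min=$4
  max=$5

  # ceil(current * util / target) using integer arithmetic only
  desired=$(( (current * util + target - 1) / target ))

  # Clamp the result to the [min, max] range from the IHPA spec
  if [ "$desired" -lt "$min" ]; then
    desired=$min
  fi
  if [ "$desired" -gt "$max" ]; then
    desired=$max
  fi

  echo "$desired"
}
```

For example, with one replica at a made-up average utilization of 180 and the sample's target of 30, desired_replicas 1 180 30 1 10 prints 6.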
Increase the load
Run the following command to get the ClusterIP and port of the NGINX service:
kubectl get svc nginx
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nginx ClusterIP 10.111.21.74 <none> 80/TCP 13m
Start a separate pod to act as a client that continuously sends requests to the NGINX service. Replace <service-ip> and <service-port> with the values obtained in the previous step:
# Run this in a separate terminal so that the load generation continues and you can carry on with the rest of the steps
kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://<service-ip>:<service-port> > /dev/null; done"
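If you prefer not to copy the service IP and port by hand, they can be read with kubectl's jsonpath output. The following is a minimal sketch, assuming the Service is named nginx and that its first listed port is the one to use; the function names service_url and nginx_service_url are made up for this example.

```shell
# Format a ClusterIP and port as the URL used by the load generator.
service_url() {
  printf 'http://%s:%s\n' "$1" "$2"
}

# Hypothetical helper: read the ClusterIP and the first port of the
# nginx Service with kubectl's jsonpath output, then format them as a URL.
nginx_service_url() {
  ip=$(kubectl get svc nginx -o jsonpath='{.spec.clusterIP}')
  port=$(kubectl get svc nginx -o jsonpath='{.spec.ports[0].port}')
  service_url "$ip" "$port"
}
```

The output of nginx_service_url can then stand in for http://<service-ip>:<service-port> in the load generator command.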
After several minutes, check the events of the IHPA to see that the workload has been scaled up:
kubectl describe ihpa dynamic-reactive-portrait-sample
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal CreateReplicaProfile 6m58s ihpa_controller create ReplicaProfile with onlineReplcas: 1, cutoffReplicas: 0, standbyReplicas: 0
Normal UpdateReplicaProfile 3m45s ihpa_controller update ReplicaProfile with onlineReplcas: 1 -> 6, cutoffReplicas: 0 -> 0, standbyReplicas: 0 -> 0
Stop generating load
In the terminal where you created the Pod that runs a busybox image, terminate the load generation by typing <Ctrl> + C.
After several minutes, check the events of the IHPA to see that the workload has been scaled down:
kubectl describe ihpa dynamic-reactive-portrait-sample
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal CreateReplicaProfile 9m58s ihpa_controller create ReplicaProfile with onlineReplcas: 1, cutoffReplicas: 0, standbyReplicas: 0
Normal UpdateReplicaProfile 6m45s ihpa_controller update ReplicaProfile with onlineReplcas: 1 -> 6, cutoffReplicas: 0 -> 0, standbyReplicas: 0 -> 0
Normal UpdateReplicaProfile 3m15s ihpa_controller update ReplicaProfile with onlineReplcas: 6 -> 4, cutoffReplicas: 0 -> 0, standbyReplicas: 0 -> 0
Normal UpdateReplicaProfile 2m45s ihpa_controller update ReplicaProfile with onlineReplcas: 4 -> 1, cutoffReplicas: 0 -> 0, standbyReplicas: 0 -> 0
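To pick the replica changes out of a longer event list, the event text can be filtered with sed. This is only a sketch: it assumes the message format shown in the sample output (including the onlineReplcas spelling), which may differ between Kapacity versions, and the function name replica_transitions is made up for this example.

```shell
# Hypothetical helper: read event lines on stdin and print each
# online replica change as "OLD -> NEW". Lines without a change
# (such as CreateReplicaProfile) are skipped.
replica_transitions() {
  sed -n 's/.*onlineReplcas: \([0-9][0-9]*\) -> \([0-9][0-9]*\),.*/\1 -> \2/p'
}
```

It can be used as kubectl describe ihpa dynamic-reactive-portrait-sample | replica_transitions.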
Cleanup
Run the following commands to clean up all the resources:
kubectl delete -f dynamic-reactive-portrait-sample.yaml
kubectl delete -f nginx-statefulset.yaml