Beyond the Basics: Optimizing Kubernetes with KEDA’s Autoscaling Magic

Amit Sharma
10 min read · Jan 15, 2024

In the ever-evolving landscape of container orchestration and deployment, Kubernetes stands as a powerhouse. However, as applications become more complex, the need for dynamic and efficient scaling mechanisms becomes imperative. This is where Kubernetes Event-Driven Autoscaling (KEDA) steps in, providing a solution that seamlessly adapts to varying workloads and enhances resource utilization. In this article, we will explore the fundamental concepts, benefits, and practical implementation of KEDA in Kubernetes.

Understanding KEDA:

KEDA is an open-source project designed to extend Kubernetes’ native horizontal pod autoscaling capabilities. Unlike traditional autoscaling, which relies on CPU or memory metrics, KEDA introduces the ability to scale based on custom metrics and external events. This enables applications to dynamically adjust their resource allocation in response to specific triggers, leading to improved efficiency and cost-effectiveness.

Key Features of KEDA:

  1. Event-Driven Scaling: KEDA allows scaling based on events such as messages in a queue, records in a database, or any other custom metric. This flexibility enables applications to scale precisely in response to real-time demands.
  2. Support for Various Event Sources: KEDA supports a wide range of event sources, including Azure Queue Storage, RabbitMQ, Apache Kafka, GCP Pub/Sub, AWS SQS, Prometheus, and more. This makes it compatible with diverse workloads and systems.
  3. Native Integration with Kubernetes: KEDA seamlessly integrates with Kubernetes, leveraging the Kubernetes Custom Resource Definitions (CRDs) to extend the native scaling capabilities. This ensures that KEDA operates smoothly within Kubernetes environments without introducing complexity.

At a high level, KEDA provides the following components to control the autoscaling process:

  • Operator / Agent: activates and deactivates Kubernetes Deployments, scaling them to and from zero when there are no events.
  • Scalers: connect to an external event source and feed custom metrics for that source. A current list of scalers is available on the KEDA home page (https://keda.sh/#scalers).
  • Metrics: acts as a Kubernetes metrics server that exposes rich event data, such as queue length or stream lag, to the Horizontal Pod Autoscaler to drive scale-out.
KEDA
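The metrics component can be verified on a live cluster: when KEDA is installed, its metrics adapter registers the external metrics API with Kubernetes. A quick check (assuming kubectl access to a cluster that already has KEDA installed):

```shell
# KEDA's metrics adapter registers the external.metrics.k8s.io API service.
# Both commands assume KEDA is already installed in the cluster.
kubectl get apiservice v1beta1.external.metrics.k8s.io

# Lists the external metrics KEDA currently exposes to the HPA
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"
```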

Benefits of Implementing KEDA:

  1. Efficient Resource Utilization: By scaling based on custom metrics and events, KEDA ensures that resources are allocated precisely when needed. This results in optimal resource utilization and cost savings.
  2. Improved Responsiveness: Applications equipped with KEDA can quickly scale in response to changes in workload, ensuring a responsive and reliable user experience. This is particularly crucial for systems with varying levels of demand.
  3. Extensibility: KEDA’s support for various event sources and its extensible architecture make it adaptable to a wide array of use cases. Whether handling streaming data, batch processing, or other event-driven scenarios, KEDA provides a scalable solution.

Practical Implementation:

Implementing KEDA involves defining a ScaledObject, which specifies the scaling rules based on the desired metric and event source. This configuration is then applied to the Kubernetes cluster, enabling KEDA to monitor and scale the associated workload dynamically. So let's start this autoscaling magic.

I assume you have a Kubernetes cluster running in your environment. Here, I am using a Kubernetes cluster on Azure. You can check the architecture using the diagram below.

Producer and Consumer KEDA Architecture

In the context of KEDA, the terms “producer” and “consumer” are often used to describe the components that generate events (producers) and those that respond to those events by scaling applications (consumers).

Producer:

  • In the context of KEDA, a producer is a component or system that generates events. These events are typically external to your application and can include messages arriving in a message queue, new records in a database, or other types of external triggers.
  • Examples of producers might include a message broker like Azure Service Bus, RabbitMQ, Kafka, or a database system emitting events when new data is added.
  • The producer is responsible for sending events to the event source, and KEDA integrates with these event sources to understand when events occur.

Consumer:

  • The consumer, in the context of KEDA, is the component that reacts to the events generated by the producer. In KEDA, this usually involves dynamically adjusting the number of instances (pods) of a Kubernetes Deployment or other scalable workload.
  • The consumer is the part of your application that KEDA scales based on the incoming events. It could be your application itself or a specific component that handles the event-driven workload.

So, let's start by creating the consumer resources in the Kubernetes cluster. All the code can be found in the GitHub repo here.

consumer resources

The consumer is deployed as a Deployment resource. Once created, it will receive the messages. Now let's create the producer, which is also a Deployment resource.

producer

A producer service is created to hit the API that passes the messages. Currently, rabbitmq-producer-service is created with LoadBalancer as the service type. It has an external IP address and exposes port 80, mapped to node port 31940. We will see how to hit the service API while producing messages later.

I have not passed the password in a Secret resource. You can create a Secret resource and reference the password from it.

producer resources

Now let’s create the RabbitMQ resources using Helm. Helm is a package manager that helps you install CRDs and resources easily. You could also use an operator to install the CRDs, but in this scenario I am using Helm. RabbitMQ comes from the Bitnami repo, so we have to add that repo with Helm.

# Adding helm repo
helm repo add bitnami https://charts.bitnami.com/bitnami

# Updating helm repo
helm repo update
adding bitnami repo
helm upgrade --install rabbitmq `
  --version 10.2.1 `
  --set auth.username=user `
  --set auth.password=PASSWORD `
  bitnami/rabbitmq
deploying rabbitmq
pods

After running the above commands, two resources are deployed: the rabbitmq deployment and its respective service.
The services rabbitmq and rabbitmq-headless have now been deployed. The rabbitmq-headless service exposes the RabbitMQ dashboard on port 15672, so if you port-forward port 15672 to localhost you will see the dashboard. The username is user and the password is PASSWORD. You can change the username and password while creating the respective resources in the helm command above.

svc
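To reach the dashboard, you can port-forward the management port; this sketch assumes the service name rabbitmq from the helm release above:

```shell
# Forward RabbitMQ's management UI port to localhost
kubectl port-forward svc/rabbitmq 15672:15672
# Then browse to http://localhost:15672 and log in (user / PASSWORD)
```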

All the setup is ready except KEDA. Before creating the KEDA resources, you can check the processing performance of messages produced by the producer and consumed by the consumer.

You can use the link below to hit the producer service and generate a number of messages. Here, TechTalks is the controller and Generate is the method; you set the count via the numberOfMessages query parameter.

http://PRODUCER_SERVICE_IP:80/api/TechTalks/Generate?numberOfMessages=100

As soon as you hit this API from the Postman UI or using a curl command, you will get a 200 status code. The producer will produce messages with the “hello” name. Once a message is produced, the consumer will be there to consume it.
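For example, with curl (substituting the EXTERNAL-IP of rabbitmq-producer-service for the placeholder):

```shell
# Placeholder: replace with the EXTERNAL-IP shown by `kubectl get svc`
PRODUCER_SERVICE_IP="<external-ip>"

# -w prints just the HTTP status code; a healthy producer returns 200
curl -s -o /dev/null -w "%{http_code}\n" \
  "http://${PRODUCER_SERVICE_IP}:80/api/TechTalks/Generate?numberOfMessages=100"
```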

env:
  - name: RABBITMQ_BATCH_SIZE
    value: "75"

Here, I am passing the environment variable RABBITMQ_BATCH_SIZE inside the consumer manifest file. The batch size of the consumer is 75, i.e., in one round it will consume 75 messages. So if you pass 100 messages, it will complete in 2 rounds. As the consumer has only one replica, a single pod does all the work. If you increase the message count to, say, 1000, the consumer will take longer: each batch processes 75 “hello” messages until all 1000 are done, still handled by that one pod.
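The round count is just ceiling division of the message count by the batch size; a quick sketch of the arithmetic:

```shell
# Values from the manifest above; round count is ceiling division
BATCH_SIZE=75
MESSAGES=1000
ROUNDS=$(( (MESSAGES + BATCH_SIZE - 1) / BATCH_SIZE ))
echo "A single replica needs ${ROUNDS} rounds for ${MESSAGES} messages"
```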

Deploying KEDA resources:

KEDA connects to event sources (producers) and scales your application (consumers) dynamically based on the events generated by these sources. The beauty of KEDA is that it allows you to leverage the power of Kubernetes to automatically scale your applications based on external events, providing a flexible and efficient way to handle varying workloads.

Now we will deploy KEDA through Helm charts. It will first install the CRDs and then create the respective resources shown below.
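The KEDA chart lives in the kedacore Helm repo; the standard install from the KEDA documentation looks like this:

```shell
# Add the KEDA chart repo and install KEDA into its own namespace
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace
```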

deploying KEDA

As you can see, KEDA has installed the resources below. We will be using the scaledobjects and triggerauthentications CRDs to deploy the respective resources.

installed crds

Let’s understand what a ScaledObject contains:

ScaledObject:

  • A ScaledObject is a Kubernetes Custom Resource that defines the scaling configuration for a specific workload. It tells KEDA which Kubernetes deployment or deployment-like resource to scale and how to scale it based on incoming events.
  • It includes information about the deployment, the scaling triggers, and the scaling behavior.

1. Deployment:

  • The deployment is the Kubernetes resource that represents the workload you want to scale based on events. It could be a Deployment, StatefulSet, Job, etc.

2. Scaler:

  • A Scaler is the component responsible for interacting with the external event source and translating events into scaling decisions. KEDA supports a variety of scalers for different event sources, such as Azure Service Bus, RabbitMQ, Kafka, etc.
  • The ScaledObject references a Scaler, indicating the type of event source to monitor.

3. Triggers:

  • Triggers define the conditions under which the workload should be scaled. For example, a trigger might be set to scale when a certain number of messages arrive in a queue or when specific metrics cross a threshold.
  • Triggers are configured within the ScaledObject.

Before deploying the ScaledObject, we have to deploy a Secret resource that contains the components below. The value of host is the AMQP connection string with the user and password.

host: 'amqp://user:PASSWORD@rabbitmq.default.svc.cluster.local:5672/'

apiVersion: v1
kind: Secret
metadata:
  name: keda-rabbitmq-secret
data:
  host: YW1xcDovL3VzZXI6UEFTU1dPUkRAcmFiYml0bXEuZGVmYXVsdC5zdmMuY2x1c3Rlci5sb2NhbDo1NjcyLw==
secrets
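The data.host value is simply the AMQP connection string base64-encoded; you can generate it yourself (PASSWORD mirrors the value used in the helm install above; substitute your own):

```shell
# Base64-encode the AMQP connection string for the Secret's data.host field
printf '%s' 'amqp://user:PASSWORD@rabbitmq.default.svc.cluster.local:5672/' | base64 | tr -d '\n'
# Output: YW1xcDovL3VzZXI6UEFTU1dPUkRAcmFiYml0bXEuZGVmYXVsdC5zdmMuY2x1c3Rlci5sb2NhbDo1NjcyLw==
```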

After deploying the Secret, let's create the TriggerAuthentication resource, which tells KEDA how to authenticate against the trigger's event source.

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-trigger-auth-rabbitmq-conn
  namespace: default
spec:
  secretTargetRef:
    - parameter: host
      name: keda-rabbitmq-secret
      key: host
triggerauthentication

Now let’s deploy the main component, i.e., the ScaledObject. In the ScaledObject resource we target the consumer deployment, which will scale to process the hello messages in the queue. Here the minimum replica count is 0, i.e., if there are no messages the pods scale down to zero; if there are more messages, the replicas can grow to a maximum of 30.

Here, we are using rabbitmq trigger type with amqp protocol and using the authentication as keda-trigger-auth-rabbitmq-conn that we have already deployed above.

---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rabbitmq-consumer-scaled-object
  namespace: default
spec:
  scaleTargetRef:
    name: rabbitmq-consumer-deployment
  pollingInterval: 60  # Optional. Default: 30 seconds
  minReplicaCount: 0   # Optional. Default: 0
  maxReplicaCount: 30  # Optional. Default: 100
  triggers:
    - type: rabbitmq
      metadata:
        queueName: hello  # queue to watch
        queueLength: "5"
        protocol: amqp
      authenticationRef:
        name: keda-trigger-auth-rabbitmq-conn
scaled objects

Now all the KEDA components have been deployed. If you send messages, the consumer pods will autoscale and each pod will process messages.
Here, I am generating 10000 messages using the API below.

http://PRODUCER_SERVICE_IP:80/api/TechTalks/Generate?numberOfMessages=10000
message produced

Now let's check our consumer deployment. As you can see in the image below, the consumer deployment is scaling automatically.

autoscaling

You can check the details of the consumers in the RabbitMQ dashboard. It will show you how many consumers are actively processing messages.

processing messages

The consumer deployment is scaling due to the message events generated while producing messages. This autoscaling is driven by the HPA (Horizontal Pod Autoscaler) that KEDA creates. If you check the HPA resources, you will see the details of the HPA.
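You can inspect that HPA and watch the pods directly; the label selector below is an assumption about how the consumer deployment is labelled in the repo:

```shell
# KEDA creates an HPA named keda-hpa-<scaledobject-name>
kubectl get hpa

# Watch the consumer pods scale out and back in (label is an assumption)
kubectl get pods -l app=rabbitmq-consumer -w
```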

HPA

Here you can see that the HPA is created with targets 109334m/5, i.e., the current queue-length metric (reported in milli-units, so roughly 109 messages per pod) against the target of 5. It has a minimum of 1 pod and a maximum of 30 pods.

In the RabbitMQ dashboard you will be able to get the details of the messages.

rabbitmq

Conclusion:

KEDA brings a new dimension to autoscaling in Kubernetes, enabling a more nuanced and responsive approach to resource management. As applications continue to evolve and diversify, the ability to scale based on custom metrics and events becomes increasingly crucial. By incorporating KEDA into your Kubernetes deployments, you unlock the potential for enhanced efficiency, cost savings, and a more resilient infrastructure. Embrace the power of KEDA and take your scalability to new heights in the world of container orchestration.
