Event Driven Architecture (EDA) Crash Course


Hello Reader,

EDA (Event Driven Architecture) has become increasingly popular in recent times. In this newsletter edition, we will explore what EDA is, what the benefits of EDA are, and then some advanced patterns of EDA, including with Kubernetes! Let's get started:

An event-driven architecture decouples the producer and the processor. In this example, the producer (a human) invokes an API and sends information in a JSON payload. API Gateway puts it into an event store (SQS), and the processor (Lambda) picks it up and processes it. Note that API Gateway and Lambda can scale (and be managed and deployed) independently.
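To make the processor side concrete, here is a minimal sketch of a Lambda handler consuming that SQS queue. The field name `orderId` and the handler's return shape are illustrative assumptions, not from a real system:

```python
import json

def handler(event, context=None):
    """Lambda handler invoked with a batch of SQS records.

    Each record's body is the JSON payload the producer sent
    through API Gateway. The "orderId" field is a made-up example.
    """
    processed_ids = []
    for record in event.get("Records", []):
        payload = json.loads(record["body"])  # the producer's JSON payload
        # ... business logic would go here ...
        processed_ids.append(payload.get("orderId"))
    return {"processed": len(processed_ids), "ids": processed_ids}
```

Because the handler only sees SQS records, it knows nothing about API Gateway or the producer, which is exactly the decoupling described above.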

Benefits of an event-driven architecture:

  1. Scale and fail independently - By decoupling your services, they are only aware of the event router, not each other. This means that your services are interoperable, but if one service has a failure, the rest will keep running. The event router acts as an elastic buffer that will accommodate surges in workloads.
  2. Develop with agility - The router also removes the need for heavy coordination between producer and consumer services, speeding up your development process.
  3. Cost efficient - With EDA, you don't need to provision each component for the highest level of scale, because consumers can process events from the event router at their own pace, or at the pace the backend database is comfortable with. For that reason, EDAs are generally cheaper than their synchronous counterparts.

Now that we understand the EDA basics, let's take a look at a couple of advanced EDA patterns:

API Gateway + EventBridge Pattern

  1. API Gateway directly posts a message onto an EventBridge event bus. Event buses are routers that receive events and deliver them to one or more targets. Event buses are well-suited for routing events from many sources to many targets, based on rules.
  2. One example could be an insurance company that deals with auto, home, and boat insurance. The API is invoked by the user, which puts a message on the event bus. One field in the message determines whether the message is for auto, home, or boat insurance. EventBridge can have rules based on the value of this field and trigger different targets accordingly. For example, it triggers Lambda1 for auto, a Step Function for home, and Lambda2 for boat insurance.
  3. EventBridge can also archive the messages for testing and replay, and even transform the messages!

SNS + SQS Pattern

This pattern is similar to the previous one but has some differences:

  1. Here, SNS routes messages to different SQS queues. An SNS filter policy allows subscribers to receive only a subset of the messages published to a topic. Filtering can be done on message attributes (metadata attached to the message, NOT the fields in the message body), or on fields in the message body itself (payload-based filtering).
  2. We have separate queues for separate messages. One burning question is: why do I need the queue in between? SNS throughput can be very high and exceed the consumption rate of the processor service (Lambda or EKS in this case), causing throttling and errors. Introducing a queue ensures that the end processor can consume the messages at a preset rate. Also, as we learned before, with SQS, retries are built in!
  3. Each queue can be processed by different targets - Lambda, EKS, and others.
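A filter policy is essentially a map from attribute names to allowed values, and a message is delivered only if its attributes satisfy every key. This sketch mimics that matching logic; the policy contents and the `insuranceType` attribute are illustrative assumptions:

```python
# Sketch of how SNS filter policies select messages for each queue.
# The policy shape follows SNS conventions (attribute name -> list of
# allowed values); the attribute names and values are made up.
AUTO_QUEUE_POLICY = {"insuranceType": ["auto"]}
MARINE_QUEUE_POLICY = {"insuranceType": ["home", "boat"]}

def matches(policy: dict, attributes: dict) -> bool:
    """True if the message attributes satisfy every key in the policy."""
    return all(
        attributes.get(key) in allowed_values
        for key, allowed_values in policy.items()
    )
```

With one policy per queue subscription, each message lands on exactly the queue(s) whose policy it satisfies, which is the "different messages to different queues" routing described above.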

Finally, let's look at an advanced pattern that is popular with enterprises, combining the superpowers of Serverless and Kubernetes.

Is EDA (Event Driven Architecture) possible only with Serverless? No, EDA is quite popular with Kubernetes as well. Many customers want to keep their business logic in containers. In today’s edition, we will look at EDA on Kubernetes with SNS and SQS.

Use SNS payload-based filtering to send messages to different queues. Note that we are NOT sending the same message to both queues as in the traditional one-to-many fanout pattern; each distinct message (signified by a different colored JSON icon) goes to a separate SQS queue.

SQS1 messages are processed by Kubernetes. Use Kubernetes Event Driven Autoscaling (KEDA) with Karpenter to implement event-driven workloads. With KEDA, you can drive the scaling of any container in Kubernetes based on the number of events needing to be processed. One popular implementation is to scale up worker nodes to accommodate pods that process messages coming into a queue:

KEDA monitors the queue depth and scales the HPA for the application that processes the messages. The HPA increases the number of pods. If there is no capacity available to schedule those pods, Karpenter provisions new nodes. Kube-scheduler places the pods on those VMs, and the pods process the messages from the queue.

Once processing is done, the number of pods goes back to zero, and Karpenter can scale the VMs down to zero for maximum cost efficiency.
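The queue-driven scaling described above is configured in KEDA with a `ScaledObject` that points at your consumer Deployment. This is a minimal sketch; the Deployment name, queue URL, region, and replica counts are illustrative assumptions:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sqs-consumer-scaler
spec:
  scaleTargetRef:
    name: sqs-consumer          # your Deployment that processes SQS1
  minReplicaCount: 0            # scale to zero when the queue is empty
  maxReplicaCount: 20
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.us-east-1.amazonaws.com/123456789012/sqs1
        queueLength: "5"        # target number of messages per replica
        awsRegion: us-east-1
```

KEDA manages the HPA behind the scenes based on this spec, and Karpenter handles the node capacity as pods appear and disappear.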

SQS2 messages are processed by Lambda, which writes the results to DynamoDB.

Now, you are equipped to handle even tough interview questions on EDA! Make sure to practice, and crush your cloud interview 🙌

If you have found this newsletter helpful, and want to support me 🙏:

Check out my bestselling courses on AWS, System Design, Kubernetes, DevOps, and more: Max discounted links

Keep learning and keep rocking 🚀,

Raj

Fast Track To Cloud

Free Cloud Interview Guide to crush your next interview. Plus, real-world answers for cloud interviews, and system design from a top Solutions Architect at AWS.
