Hello Reader,

If the Most Hyped Technology of the Year award goes to Gen AI this year, it went to Platform Engineering last year. Kubernetes is complex enough already, so many of us asked: why do we need a platform team? And more importantly, why should we care? Well, for starters, it's a hot topic in interviews, and there are a lot of platform jobs. As you all know, I don't believe in idle theory-crafting; my goal is to teach you things that help you succeed in interviews and real jobs. So, let's start from the beginning.

In the early days

The basic container lifecycle is as below:

The fundamental container workflow from local machine to cloud is below:
Kubernetes cluster considerations

There are some considerations when it comes to the Kubernetes cluster:
Autonomy vs. Standardization

As a developer, you want the freedom to create whatever AWS resources you want. I was a developer once, and I didn't give any thought to cost or other best practices. After all, I am the developer, and the world shall bow to me!

Developers want autonomy

On the other hand, organizations need to enforce some standards so that developers can't just provision a Kubernetes cluster with public endpoints or install an add-on they are not supposed to, and so that they DO NOT UPGRADE a multi-tenant cluster without getting approval or testing with the other tenants. Even though shift-left, i.e., developers doing more, is good, let's be honest: managing a Kubernetes cluster is A LOT, and it adds a lot to the developer's plate! For these reasons, Platform Teams were born!

Enter Platform Team

Platform teams take over the responsibility of creating and managing the cluster. With the platform team in the picture, the flow looks like this:

Step 1: The developer team asks the platform team to provision the appropriate AWS resources. In this example, we are using Amazon EKS for the application, but this concept can be extended to any other AWS service. This request for AWS resources is typically made via the ticketing system.

In addition to the above, platform teams implement standards, such as creating OPA/Kyverno policies to ensure developers can't deploy non-standard applications. They also coordinate with the application teams and handle upgrades. Platform teams also help with cost optimization, troubleshooting, implementing GitOps, developing a CI/CD strategy, etc. They enable developers to focus on business needs without worrying about managing the cluster. It's like reverse DevOps ;)

The platform team takes care of the infrastructure (with the guardrails appropriate for the organization), and the developer team uses that infrastructure to deploy their application.
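To make the guardrail idea concrete, here is a minimal Python sketch of the kind of check a platform team might run on a cluster request before provisioning anything. It is only an illustration of the two standards mentioned above (no public endpoints, no unapproved add-ons). It is not OPA or Kyverno, where real policies are written in Rego or YAML. The names `ClusterRequest`, `check_request`, and `APPROVED_ADDONS`, along with the contents of the allow-list, are assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical allow-list; a real platform team would define its own.
APPROVED_ADDONS = {"vpc-cni", "coredns", "kube-proxy"}


@dataclass
class ClusterRequest:
    """A developer team's cluster request (illustrative fields only)."""
    team: str
    public_endpoint: bool = False
    addons: List[str] = field(default_factory=list)


def check_request(request: ClusterRequest) -> List[str]:
    """Return the guardrail violations for a request (empty list = compliant)."""
    violations = []
    # Standard 1: clusters must not expose a public endpoint.
    if request.public_endpoint:
        violations.append("public cluster endpoint is not allowed")
    # Standard 2: only add-ons on the approved list may be installed.
    for addon in request.addons:
        if addon not in APPROVED_ADDONS:
            violations.append(f"add-on '{addon}' is not on the approved list")
    return violations


if __name__ == "__main__":
    request = ClusterRequest(
        team="payments",
        public_endpoint=True,
        addons=["coredns", "my-custom-addon"],
    )
    for problem in check_request(request):
        print(problem)
```

A compliant request returns an empty list, so the platform team could wire a check like this into the ticket workflow and only provision when nothing comes back.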
The platform team handles the upgrades and maintenance of the infrastructure to reduce the burden on the developer team.

Now you know why platform teams are so important and how they were born! If this question comes up in your interview, make sure to knock it out of the park!

If you have found this newsletter helpful and want to support me 🙏:

Check out my bestselling courses on AWS, System Design, Kubernetes, DevOps, and more: Max discounted links

AWS SA Bootcamp with Live Classes, Mock Interviews, Hands-On, Resume Improvement, and more (waitlist for next cohort): https://www.sabootcamp.com/

Keep learning and keep rocking 🚀,
Raj