Remote MCP Server on Kubernetes: Step-by-Step Guide ⚙


Hello Reader,

Most engineers use MCP clients and agents, but very few know how to build and host an MCP server, let alone run it remotely in the cloud. In today's edition, we will learn how to create and run a remote MCP server on Kubernetes, using Amazon EKS. I will share the code repo as well, so you can try this out yourself. But first...

🔧 What is an MCP Server really?

It’s not just an API that performs a task.

An MCP Server is an endpoint that complies with the Model Context Protocol (MCP, defined by Anthropic) and allows any MCP client to:

  • Discover what tools are available
  • Understand what arguments the tools accept
  • Call those tools with dynamic inputs
  • Handle chunked responses over Streamable HTTP

Everything flows through a /mcp endpoint with JSON-RPC 2.0 methods like:

  • tools/list — return available tools with descriptions, args, and metadata
  • tools/call — invoke a tool with arguments and return result

Because it’s standardized, any agent or MCP client can call it without custom wiring.
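To make tools/list and tools/call concrete, here is a minimal sketch of a hand-rolled JSON-RPC 2.0 dispatcher in Python, using only the standard library. The `get_time` tool, the `TOOLS` registry, and the `describe` and `handle` helpers are hypothetical names I made up for illustration; they are not from the repo. The message shapes follow the MCP spec as I understand it, and a real server also needs `initialize`, the HTTP transport, and proper error handling, all of which this skips.

```python
import inspect
import json

# Hypothetical example tool for illustration; not taken from the repo.
def get_time(city: str) -> str:
    """Return a placeholder time string for a city."""
    return f"12:00 in {city}"

# Registry of callable tools, keyed by tool name.
TOOLS = {"get_time": get_time}

# Map Python type hints to JSON Schema types (only what this sketch needs).
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def describe(fn):
    """Build one tools/list entry from a function's signature and docstring."""
    params = inspect.signature(fn).parameters
    props = {
        name: {"type": JSON_TYPES.get(p.annotation, "string")}
        for name, p in params.items()
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        # Simplification: every parameter is treated as required.
        "inputSchema": {"type": "object", "properties": props, "required": list(props)},
    }

def handle(request: dict) -> dict:
    """Dispatch one JSON-RPC 2.0 request and return the response message."""
    method = request["method"]
    params = request.get("params", {})
    if method == "tools/list":
        result = {"tools": [describe(fn) for fn in TOOLS.values()]}
    elif method == "tools/call":
        output = TOOLS[params["name"]](**params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": str(output)}]}
    else:
        # -32601 is the standard JSON-RPC "Method not found" error code.
        return {"jsonrpc": "2.0", "id": request.get("id"),
                "error": {"code": -32601, "message": "Method not found"}}
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}

# Tool discovery, then a tool call with dynamic inputs.
list_response = handle({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
call_response = handle({
    "jsonrpc": "2.0", "id": 2, "method": "tools/call",
    "params": {"name": "get_time", "arguments": {"city": "New York"}},
})
print(json.dumps(list_response, indent=2))
print(json.dumps(call_response, indent=2))
```

Notice that the tool description is derived from the function's type hints and docstring. That is roughly the kind of work a library like FastMCP takes off your hands, so you only write the tool function itself.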

🧱 High Level Flow for Remote MCP Server

  1. You can code the MCP methods (tools/list, tools/call, etc.) yourself, which can be tedious, or use a library that implements MCP, such as FastMCP. FastMCP abstracts the MCP methods and generates them for you from the details in your program.
  2. Build a container image from the Dockerfile and push it to Amazon ECR.
  3. Deploy it to a Kubernetes cluster, such as Amazon EKS, and expose it through an ALB via a Service or Ingress. The ALB and the container support Streamable HTTP out of the box and work well together.
  4. Invoke your MCP server using the ALB URL. With Streamable HTTP, you first initialize a session, then pass the returned session ID on subsequent calls for tool discovery and tool calls.
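Step 4 is the one that trips people up, so here is a rough client-side sketch of the session flow in Python, standard library only. The URL, the client name, and the helper names (`build_headers`, `parse_body`, `post`, `run_session`) are placeholders I made up; swap in your own ALB DNS name. It assumes the server returns the session ID in the `Mcp-Session-Id` response header, as the Streamable HTTP transport describes, and it simplifies SSE parsing by assuming each event carries a single `data:` line and that the server closes the stream after responding.

```python
import json
import urllib.request
from typing import Optional

# Assumption: placeholder URL; replace with your own ALB DNS name.
MCP_URL = "http://my-alb.example.com/mcp"

def build_headers(session_id: Optional[str] = None) -> dict:
    """Headers for a Streamable HTTP POST; adds the session ID once known."""
    headers = {
        "Content-Type": "application/json",
        # The server may answer with plain JSON or with an SSE stream.
        "Accept": "application/json, text/event-stream",
    }
    if session_id:
        headers["Mcp-Session-Id"] = session_id
    return headers

def parse_body(content_type: str, body: str) -> list:
    """Return the JSON-RPC messages in a response body (plain JSON or SSE)."""
    if content_type.startswith("text/event-stream"):
        # Simplification: one JSON message per "data:" line.
        return [json.loads(line[len("data:"):])
                for line in body.splitlines() if line.startswith("data:")]
    return [json.loads(body)] if body.strip() else []

def post(payload: dict, session_id: Optional[str] = None):
    """Send one JSON-RPC message; return (session ID, list of messages)."""
    req = urllib.request.Request(
        MCP_URL, data=json.dumps(payload).encode("utf-8"),
        headers=build_headers(session_id), method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        sid = resp.headers.get("Mcp-Session-Id", session_id)
        content_type = resp.headers.get("Content-Type", "")
        return sid, parse_body(content_type, resp.read().decode("utf-8"))

def run_session():
    """Initialize a session, then reuse its ID for tool discovery."""
    sid, _ = post({
        "jsonrpc": "2.0", "id": 1, "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": "demo-client", "version": "0.1.0"},
        },
    })
    # Notifications carry no "id" and expect no JSON-RPC response.
    post({"jsonrpc": "2.0", "method": "notifications/initialized"}, sid)
    _, messages = post({"jsonrpc": "2.0", "id": 2, "method": "tools/list"}, sid)
    return messages
```

Nothing is sent until you call `run_session()` against a live endpoint. A tools/call request would reuse the same session ID in exactly the same way.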

A detailed walkthrough of the code, with a demo, is in the video for this edition.

Code repo: https://github.com/saha-rajdeep/Remote-MCP-Server

🧱 Local vs Remote: Big Difference

Should you run your MCP server on your laptop or remotely? Here are the considerations:

  • A remote implementation is scalable. In this example, the ALB scales on its own, the pods running the MCP server container can be scaled with HPA, and the nodes underneath them can be scaled with Karpenter. You can't really scale a local MCP server.
  • A remote MCP server can be used by many clients because it's exposed through a load balancer.
  • You can implement AuthN/Z with a remote MCP server. In this example, you can integrate Cognito with the ALB.
  • This example is implemented using Amazon EKS, but the same methodology applies, and the same server can run on ECS, serverless, EC2, etc.
  • Since local MCP servers don't require traffic to traverse the internet, it can be argued that they are more secure. However, for real-world projects, almost all MCP servers will be implemented remotely.

🧠 Interview Ready

If someone asks you:

“How would you implement and host an MCP server on AWS?”

Don’t just say “run it in Lambda or EKS.”

Explain the MCP spec:

  • /mcp endpoint
  • JSON-RPC format
  • tools/list and tools/call
  • Transport protocols (stdio for local, Streamable HTTP for remote)

Then walk them through your deployment strategy (Docker, ECR, EKS, Load Balancer) and security posture.

That alone can set you apart from 90% of other candidates.

If you have found this newsletter helpful, and want to support me 🙏:

Check out my bestselling courses on AWS, System Design, Kubernetes, DevOps, and more: Max discounted links

Keep learning and keep rocking 🚀,

Raj

Fast Track To Cloud
