Remote MCP Server on Kubernetes: Step-by-Step Guide ⚙


Hello Reader,

Most engineers are using MCP clients and agents. But very few know how to build and host an MCP server, let alone run it remotely in the cloud. In today's edition, we will learn how to create and run a remote MCP server on Kubernetes, using Amazon EKS! I will share the code repo as well, so you can try this out yourself. But first...

🔧 What is an MCP Server really?

It’s not just an API that performs a task.

An MCP Server is an endpoint that complies with the Model Context Protocol (introduced by Anthropic) and allows any MCP client to:

  • Discover what tools are available
  • Understand what arguments the tools accept
  • Call those tools with dynamic inputs
  • Handle chunked responses over Streamable HTTP
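
To make the first three capabilities concrete, here is a minimal sketch of the server-side idea: keep a registry of functions, describe each one (name, description, argument schema) from its signature, and dispatch a call by name. This is a toy illustration only, not how FastMCP or any real MCP library is implemented. The names tool, describe_tools, call_tool, and the add tool are all made up for this example.

```python
import inspect
from typing import get_type_hints

# Map a few Python annotations to JSON Schema type names (illustrative only).
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}

# Registry of tool name -> function. Hypothetical, for this sketch only.
_TOOLS = {}


def tool(fn):
    """Register a plain function as a callable tool."""
    _TOOLS[fn.__name__] = fn
    return fn


def describe_tools():
    """Discovery: return name, description, and argument schema per tool."""
    described = []
    for name, fn in _TOOLS.items():
        hints = get_type_hints(fn)
        hints.pop("return", None)
        params = inspect.signature(fn).parameters
        described.append({
            "name": name,
            "description": inspect.getdoc(fn) or "",
            "inputSchema": {
                "type": "object",
                # Unannotated arguments fall back to "string" in this sketch.
                "properties": {
                    arg: {"type": _JSON_TYPES.get(hints.get(arg), "string")}
                    for arg in params
                },
                # Arguments without a default value are treated as required.
                "required": [
                    arg for arg, p in params.items()
                    if p.default is inspect.Parameter.empty
                ],
            },
        })
    return described


def call_tool(name, arguments):
    """Invocation: look the tool up by name and pass the dynamic inputs."""
    if name not in _TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return _TOOLS[name](**arguments)


@tool
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


print(describe_tools())
print(call_tool("add", {"a": 2, "b": 3}))
```

A client that has only seen the output of describe_tools knows enough to call add correctly, which is the whole point of discovery.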

Everything flows through a /mcp endpoint with JSON-RPC 2.0 methods like:

  • tools/list — return available tools with descriptions, args, and metadata
  • tools/call — invoke a tool with arguments and return result

Because it’s standardized, any agent or MCP client can call it without custom wiring.
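
As a rough illustration of what these JSON-RPC 2.0 messages look like on the wire, the sketch below builds a tools/list request and a tools/call request as Python dicts. The tool name "add", its arguments, and the sample response are hypothetical and were not captured from a real server. The helper jsonrpc_request is also just a name chosen for this example.

```python
import json
from itertools import count

# JSON-RPC requests carry an id so each response can be matched to its request.
_next_id = count(1)


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request body as a dict."""
    body = {"jsonrpc": "2.0", "id": next(_next_id), "method": method}
    if params is not None:
        body["params"] = params
    return body


# Discovery: ask the server which tools it offers.
list_request = jsonrpc_request("tools/list")

# Invocation: the tool "add" and its arguments are assumptions for illustration.
call_request = jsonrpc_request(
    "tools/call", {"name": "add", "arguments": {"a": 2, "b": 3}}
)

# Illustrative shape of a tools/list response (hand-written, not real output).
sample_list_response = {
    "jsonrpc": "2.0",
    "id": list_request["id"],
    "result": {
        "tools": [
            {
                "name": "add",
                "description": "Add two integers.",
                "inputSchema": {
                    "type": "object",
                    "properties": {
                        "a": {"type": "integer"},
                        "b": {"type": "integer"},
                    },
                    "required": ["a", "b"],
                },
            }
        ]
    },
}


def tool_names(response):
    """Pull the tool names out of a tools/list response."""
    return [t["name"] for t in response["result"]["tools"]]


print(json.dumps(list_request))
print(json.dumps(call_request))
print(tool_names(sample_list_response))
```

Each of these bodies would be sent as an HTTP POST to the /mcp endpoint.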

🧱 High Level Flow for Remote MCP Server

  1. You can code the MCP methods such as tools/list and tools/call yourself, which can be tedious, or use an MCP library like FastMCP. FastMCP abstracts the MCP methods and generates them for you from the details in your program.
  2. Build a container image from the Dockerfile and push it to Amazon ECR
  3. Deploy it to a Kubernetes cluster, such as Amazon EKS, and expose it using an ALB via a Service or Ingress. The ALB and the container support Streamable HTTP out of the box, so this setup works nicely
  4. Invoke your MCP server using the ALB URL. For Streamable HTTP, you first initialize a session, then use the session ID on subsequent calls for tool discovery and tool calls
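
Step 4 can be sketched with only the Python standard library. This is a minimal sketch under several assumptions, not the code from the repo: the ALB URL is a placeholder, the protocol version string should be checked against what your server supports, the client name is made up, and the server is assumed to return the session ID in an Mcp-Session-Id response header. The helper functions only build the requests. run_flow shows the order of calls but is not executed here because it needs a live server.

```python
import json
import urllib.request

MCP_URL = "http://my-alb.example.com/mcp"  # placeholder, replace with your ALB URL
SESSION_HEADER = "Mcp-Session-Id"  # assumed header carrying the session ID


def build_post(body, session_id=None, url=MCP_URL):
    """Build a POST to the /mcp endpoint, attaching the session ID if known."""
    headers = {
        "Content-Type": "application/json",
        # The server may answer with plain JSON or with an event stream.
        "Accept": "application/json, text/event-stream",
    }
    if session_id:
        headers[SESSION_HEADER] = session_id
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


def initialize_body():
    """First call: open a session. Version and client info are assumptions."""
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": "demo-client", "version": "0.1.0"},
        },
    }


def initialized_notification():
    """Notification (no id) telling the server the client is ready."""
    return {"jsonrpc": "2.0", "method": "notifications/initialized"}


def tools_list_body():
    """Tool discovery, sent with the session ID from the initialize response."""
    return {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}


def run_flow(url=MCP_URL):
    """Initialize, confirm, then list tools. Requires a reachable MCP server."""
    with urllib.request.urlopen(build_post(initialize_body(), url=url)) as resp:
        session_id = resp.headers.get(SESSION_HEADER)
    ready = build_post(initialized_notification(), session_id, url)
    urllib.request.urlopen(ready).close()
    listing = build_post(tools_list_body(), session_id, url)
    with urllib.request.urlopen(listing) as resp:
        # Returned as raw text because the body may be JSON or event-stream lines.
        return resp.read().decode("utf-8")


# Build (but do not send) a follow-up request to show the session header in place.
example = build_post(tools_list_body(), session_id="example-session-id")
print(example.get_method(), example.full_url)
print(example.header_items())
```

A tools/call request would follow the same pattern as tools/list: same endpoint, same session header, different method and params.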

Code repo: https://github.com/saha-rajdeep/Remote-MCP-Server

🧱 Local vs Remote: Big Difference

Should you run your MCP server on your laptop or remotely? Here are the considerations:

  • A remote implementation is scalable. In this example, the ALB scales automatically, and the pods running the MCP server container can be scaled with HPA, with Karpenter adding nodes as needed. You can't really scale a local MCP server
  • A remote MCP server can be used by many clients because it's exposed through a load balancer
  • You can implement AuthN/Z with a remote MCP server. In this example, you can integrate Cognito with the ALB
  • This example is implemented on Amazon EKS, but the same methodology applies elsewhere, and the same server can run on ECS, serverless, EC2, etc.
  • Since traffic to a local MCP server doesn't traverse the internet, it can be argued that local servers are more secure. However, for real-world projects, almost all MCP servers will be implemented remotely

🧠 Interview Ready

If someone asks you:

“How would you implement and host an MCP server on AWS?”

Don’t just say “run it in Lambda or EKS.”

Explain the MCP spec:

  • /mcp endpoint
  • JSON-RPC format
  • tools/list and tools/call
  • Transport protocols (stdio for local, Streamable HTTP for remote)

Then walk them through your deployment strategy (Docker, ECR, EKS, Load Balancer) and security posture.

That alone can set you apart from 90% of other candidates.

If you have found this newsletter helpful, and want to support me 🙏:

Check out my bestselling courses on AWS, System Design, Kubernetes, DevOps, and more: Max discounted links

Keep learning and keep rocking 🚀,

Raj
