Will MCP Make RAG Obsolete?


Hello Reader,

MCP is all the rage now. But does this make RAG obsolete? This is becoming a burning question in real-world projects and in interviews. In today's edition, let's take a closer look and find out the answer.

Let's quickly go over what each one is.

RAG (Retrieval Augmented Generation)

  1. RAG (Retrieval Augmented Generation) is used where the response can be improved with company-specific context that the LLM does NOT have. You store relevant company data in a vector database. This is done through a process called embedding, where data is transformed into numeric vectors
  2. The user gives a prompt that can be improved by adding company-specific info
  3. A process (code/Jupyter notebook/application) converts the prompt into a vector and then searches the vector database. Relevant info from the vector database is RETRIEVED (first part of RAG) and returned
  4. The original prompt is AUGMENTED (second part of RAG) with this company-specific info and sent to the LLM
  5. The LLM GENERATES (last part of RAG) the response and sends it back to the user
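
To make the three parts concrete, here is a minimal sketch of the flow in Python. Everything in it is an assumption for illustration: the bag-of-words `embed` function stands in for a real embedding model, the `vector_db` list stands in for a real vector database, and `generate` is a placeholder where an actual LLM call would go.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Step 1: company data is embedded and stored (a list stands in for the vector database).
documents = [
    "refunds are processed within 14 days of the return request",
    "the office is closed on public holidays",
    "support tickets are answered within one business day",
]
vector_db = [(doc, embed(doc)) for doc in documents]


def retrieve(prompt: str, k: int = 1) -> list:
    """RETRIEVE: embed the prompt and return the k most similar documents."""
    query_vec = embed(prompt)
    ranked = sorted(vector_db, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def augment(prompt: str, context: list) -> str:
    """AUGMENT: prepend the retrieved company info to the original prompt."""
    joined = "\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {prompt}"


def generate(augmented_prompt: str) -> str:
    """GENERATE: placeholder for the LLM call (hypothetical, no model is invoked)."""
    return f"[LLM response based on]\n{augmented_prompt}"


prompt = "how long do refunds take"
context = retrieve(prompt)
answer = generate(augment(prompt, context))
print(answer)
```

In a real project, `embed` would call an embedding model and `retrieve` would run a similarity query against the vector database, but the three-part shape stays the same.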

MCP (Model Context Protocol)

MCP standardizes the communication between agentic code and its tools and data sources. What does this mean?

  • An MCP client (think of a piece of code running inside the agent) connects to the MCP server instead of connecting to the tool directly with a predefined API URL
  • The developers of each tool expose this MCP server
  • The MCP client asks the server, "What can you do?" The MCP server responds with each tool's capabilities, description, and input/output schemas
  • IMPORTANT - this discovery is dynamic and happens at runtime. If the input/output fields change, the discovery call will reveal the updated fields at runtime
  • The MCP client registers all of these and can then invoke the tool via the MCP server
  • The MCP server handles the connection to the tool. As a result, the agent code does NOT need to hardcode the API URLs like before
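
The discovery-then-invoke pattern above can be sketched with a toy, in-process simulation. This is not the official MCP SDK: `ToyMCPServer`, `ToyMCPClient`, and the `get_weather` tool are hypothetical names made up for illustration. The JSON-RPC method names `tools/list` and `tools/call` and the `inputSchema` field follow the MCP specification, while the initialization handshake and the network transport are left out to keep the sketch short.

```python
import json


class ToyMCPServer:
    """In-process stand-in for an MCP server fronting one hypothetical tool."""

    def __init__(self):
        self._tools = {
            "get_weather": {
                "description": "Return the current weather for a city.",
                "inputSchema": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
                # The server, not the agent, knows how to reach the real tool.
                "handler": lambda args: f"Sunny in {args['city']}",
            }
        }

    def handle(self, raw_request: str) -> str:
        request = json.loads(raw_request)
        method = request["method"]
        if method == "tools/list":
            # Discovery: describe every tool and its input schema at runtime.
            result = {"tools": [
                {"name": name, "description": tool["description"],
                 "inputSchema": tool["inputSchema"]}
                for name, tool in self._tools.items()
            ]}
        elif method == "tools/call":
            params = request["params"]
            text = self._tools[params["name"]]["handler"](params["arguments"])
            result = {"content": [{"type": "text", "text": text}]}
        else:
            error = {"code": -32601, "message": "Method not found"}
            return json.dumps({"jsonrpc": "2.0", "id": request["id"], "error": error})
        return json.dumps({"jsonrpc": "2.0", "id": request["id"], "result": result})


class ToyMCPClient:
    """Agent-side client: discovers tools first, then invokes them by name."""

    def __init__(self, server: ToyMCPServer):
        self._server = server
        self._next_id = 0
        self.tools = {}

    def _request(self, method: str, params: dict = None) -> dict:
        self._next_id += 1
        message = {"jsonrpc": "2.0", "id": self._next_id,
                   "method": method, "params": params or {}}
        response = json.loads(self._server.handle(json.dumps(message)))
        if "error" in response:
            raise RuntimeError(response["error"]["message"])
        return response["result"]

    def discover(self) -> None:
        """Ask the server "What can you do?" and register the answers."""
        for tool in self._request("tools/list")["tools"]:
            self.tools[tool["name"]] = tool

    def call(self, name: str, arguments: dict) -> str:
        result = self._request("tools/call", {"name": name, "arguments": arguments})
        return result["content"][0]["text"]


client = ToyMCPClient(ToyMCPServer())
client.discover()
print(sorted(client.tools))
print(client.call("get_weather", {"city": "Austin"}))
```

Notice that the client never sees an API URL for the weather tool. It only learns the tool's name and schema from the discovery call, so a schema change on the server side shows up the next time `discover()` runs.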

The Showdown

The obvious question on your mind is: if MCP can access the data source, and the same data source is being used in RAG, why not skip the whole RAG part and just ask the question to the MCP Host App as shown above? There are a couple of problems with this approach:

  • If you want MCP to connect to the database directly, that means the app team needs to give access to the database to the MCP server.
  • If you have ever worked at an enterprise, you know teams don't want to give access to their actual database. You might think this is territorial (part of it is!), but there are also good reasons for it:
    • Once one team is granted access, more teams will expect the same access
    • As these teams hit the actual database, they eat away at the read and write capacity the business application needs to serve customers
    • Databases are expensive, and scaling up means more cost
    • The app team managing the database can't change the schema freely because it would break the MCP flow

RAG actually solves this problem. How?

  • The RAG flow does NOT access the actual database, only the vector database. This way, RAG queries don't consume the application database's capacity
  • The embedding process typically runs nightly during off-hours
  • Embedding into a vector database also indexes the data for similarity search, which is ideal for text queries
  • Queries against the vector database are not impacted even if the schema changes in the underlying database; only the embedding job may need to be updated
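
Here is a small sketch of that decoupling, under stated assumptions: an in-memory SQLite table plays the application database, a Python list plays the vector database, and a word-set `embed` function stands in for a real embedding model. The `products` table and the function names are hypothetical.

```python
import sqlite3

# Hypothetical application database owned by the app team.
app_db = sqlite3.connect(":memory:")
app_db.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, description TEXT)"
)
app_db.executemany(
    "INSERT INTO products (name, description) VALUES (?, ?)",
    [
        ("Trail Shoe", "lightweight waterproof running shoe"),
        ("Desk Lamp", "adjustable led lamp with warm light"),
    ],
)


def embed(text: str) -> set:
    """Toy embedding: the set of lowercase words (stand-in for a real model)."""
    return set(text.lower().split())


def nightly_sync(db: sqlite3.Connection) -> list:
    """Off-hours batch job: read the source rows once and rebuild the index."""
    rows = db.execute("SELECT id, name, description FROM products").fetchall()
    return [
        {"id": row[0], "text": f"{row[1]}: {row[2]}", "vector": embed(f"{row[1]} {row[2]}")}
        for row in rows
    ]


def search(index: list, query: str, k: int = 1) -> list:
    """Rank index entries by Jaccard overlap with the query; never touches app_db."""
    query_vec = embed(query)

    def jaccard(vector: set) -> float:
        union = query_vec | vector
        return len(query_vec & vector) / len(union) if union else 0.0

    return sorted(index, key=lambda entry: jaccard(entry["vector"]), reverse=True)[:k]


vector_index = nightly_sync(app_db)

# During the day the app team changes the schema and the database may be busy
# or even unavailable. The index is a snapshot, so RAG queries keep working.
app_db.execute("ALTER TABLE products ADD COLUMN price REAL")
app_db.close()

top = search(vector_index, "waterproof shoe")
print(top[0]["text"])
```

The search still succeeds after the connection is closed, which is the point: query traffic lands on the index, and only the sync job ever reads the application database.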

As we can see, RAG also has some strengths. And MCP is already powerful. What now? Turns out there is a middle ground, a best-of-both-worlds solution.

The Solution

Sometimes you do want to query the MCP host because it has access to many tools, and even other agents! In those cases, if you need to use the vector database, you can put an MCP server in front of it. On AWS, for example, there is an MCP server that interacts directly with Amazon Bedrock Knowledge Bases, which is essentially the managed layer over the vector database.
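
As a rough sketch of this middle ground, the snippet below registers vector search as just one more tool next to an unrelated one, so the agent gets company context without touching the application database. All names here (`Tool`, `TOOLS`, `search_knowledge_base`, `create_ticket`) are hypothetical, and the word-overlap `vector_search` stands in for a real query against a vector database or a managed knowledge base.

```python
from dataclasses import dataclass
from typing import Callable

# Stand-in for documents already embedded in the vector database.
KNOWLEDGE_BASE = [
    "vpn access requires a ticket approved by your manager",
    "laptops are refreshed every three years",
]


def vector_search(query: str, k: int = 1) -> list:
    """Toy retrieval by word overlap; a real server would query the vector store."""
    words = set(query.lower().split())
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(words & set(doc.split())),
        reverse=True,
    )
    return ranked[:k]


@dataclass
class Tool:
    name: str
    description: str
    input_schema: dict
    handler: Callable[[dict], str]


# Tools the (hypothetical) MCP server exposes: retrieval sits beside other tools.
TOOLS = {
    tool.name: tool
    for tool in [
        Tool(
            name="search_knowledge_base",
            description="Find company documents relevant to a question.",
            input_schema={
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
            handler=lambda args: "\n".join(vector_search(args["query"])),
        ),
        Tool(
            name="create_ticket",
            description="Open an IT ticket with the given title.",
            input_schema={
                "type": "object",
                "properties": {"title": {"type": "string"}},
                "required": ["title"],
            },
            handler=lambda args: f"ticket created: {args['title']}",
        ),
    ]
}


def call_tool(name: str, arguments: dict) -> str:
    """Dispatch a tool call by name, as the server would for the agent."""
    return TOOLS[name].handler(arguments)


# The agent first pulls company context, then acts on it with another tool.
context = call_tool("search_knowledge_base", {"query": "how do I get vpn access"})
ticket = call_tool("create_ticket", {"title": "VPN access request"})
print(context)
print(ticket)
```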

On the other hand, if you only need the RAG flow, you can go the RAG route. The RAG route is generally faster because it just queries the vector database. The MCP flow, meanwhile, involves the handshake and discovery calls discussed in the brief MCP overview above, and this back and forth introduces some latency.

In summary, RAG is going nowhere, and MCP complements RAG! If you get this question in your interview or projects, knock it out of the park!

Keep learning and keep rocking 🚀,

Raj
