Hello Reader,

MCP is all the rage now. But does this make RAG obsolete? This is becoming a burning question in real-world projects and in interviews. In today's edition, let's take a closer look and find out the answer.

Let's go over what is what quickly.

RAG (Retrieval Augmented Generation)
MCP (Model Context Protocol)

MCP standardizes the communication between agentic code and its tools and data sources. What does this mean?
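To make "standardized communication" a bit more concrete, here is a rough sketch of the kind of messages an MCP host exchanges with an MCP server. MCP messages use the JSON-RPC 2.0 envelope, and `initialize`, `tools/list`, and `tools/call` are method names from the MCP specification. Everything else here is an assumption for illustration: the `rpc_request` helper, the client name `demo-host`, the example protocol version string, and the tool name `query_knowledge_base` are all hypothetical.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry an id so replies can be matched


def rpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request dict (hypothetical helper)."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


# 1. Handshake: the host introduces itself and negotiates capabilities.
initialize = rpc_request("initialize", {
    "protocolVersion": "2025-03-26",  # example value, check the spec you target
    "capabilities": {},
    "clientInfo": {"name": "demo-host", "version": "0.1.0"},  # hypothetical
})

# 2. Discovery: the host asks which tools the server offers.
list_tools = rpc_request("tools/list")

# 3. Invocation: the host calls one tool by name with arguments.
call_tool = rpc_request("tools/call", {
    "name": "query_knowledge_base",  # hypothetical tool name
    "arguments": {"query": "What is our refund policy?"},
})

for message in (initialize, list_tools, call_tool):
    print(json.dumps(message))
```

Because every server speaks the same three-step shape (handshake, discovery, invocation), the host code does not need a custom integration per tool or data source.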
The Showdown

The obvious question on your mind is: if MCP can access the data source, and the same data source is being used in RAG, why not skip the whole RAG part and just ask the question to the MCP Host App as shown above? There are a couple of problems with this approach:
RAG actually solves this problem. How?
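One way to picture the RAG flow is as a retrieval step followed by a prompt-building step: only the chunks most similar to the question are passed to the model. The sketch below is a minimal illustration under stated assumptions, not a production setup. The bag-of-words `embed` function, the sample chunks, and the vocabulary are all made up here and stand in for a real embedding model and vector database.

```python
import math


def embed(text, vocab):
    """Toy bag-of-words embedding (assumption: replaces a real embedding model)."""
    words = text.lower().split()
    return [words.count(term) for term in vocab]


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, vocab, k=2):
    """Return the k chunks most similar to the question."""
    q_vec = embed(question, vocab)
    ranked = sorted(chunks, key=lambda c: cosine(q_vec, embed(c, vocab)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Augment the question with only the retrieved context."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical documents that would normally live in a vector database.
chunks = [
    "refunds are processed within 5 business days",
    "our office is closed on public holidays",
    "refunds require the original receipt",
]
vocab = ["refunds", "receipt", "office", "holidays", "days"]

question = "how do refunds work"
top_chunks = retrieve(question, chunks, vocab, k=2)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

The unrelated chunk about office holidays never reaches the prompt, so the model sees a small, relevant context instead of the whole data source.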
As we can see, RAG also has some strengths. And MCP is already powerful. What now? It turns out there is a middle ground, a best-of-both-worlds solution.

The Solution

Sometimes you do want to query the MCP host, because the MCP host has access to many tools, and even other agents! In those cases, if you need to use the vector database, there is an MCP server that interacts directly with the vector database. For AWS, such an MCP server can interact directly with a Bedrock Knowledge Base, which is basically the vector database.

On the other hand, if you need the RAG flow, you can go the RAG route. The RAG route is generally faster because it just queries the vector database. Meanwhile, the MCP flow involves handshaking, as discussed above in the brief MCP overview. This back and forth introduces some latency.

In summary, RAG is going nowhere, and MCP complements RAG! If you get this question in your interview or projects, knock it out of the park!

If you have found this newsletter helpful, and want to support me 🙏:

Check out my bestselling courses on AWS, System Design, Kubernetes, DevOps, and more: Max discounted links

AWS SA Bootcamp with Live Classes, Mock Interviews, Hands-On, Resume Improvement and more: https://www.sabootcamp.com/
Keep learning and keep rocking 🚀,
Raj