It was a Tuesday morning when a Fortune 500 energy company found out their internal knowledge assistant had been feeding executives fabricated safety protocols. The culprit wasn’t a hallucination. An attacker had slipped malicious documents into the retrieval pipeline, bypassing all the existing LLM guardrails. The incident sent shockwaves through the AI security community and sped up a shift that was already happening: a complete rethink of how we secure retrieval augmented generation.
This scenario isn’t hypothetical anymore. With the updated OWASP Top 10 for LLM Applications, five entirely new vulnerability categories have popped up that specifically target RAG architectures. The previous OWASP list, last updated in 2025, focused mostly on prompt injection and model theft. The 2026 edition makes a sobering admission: the most dangerous attack surface in modern AI is not the language model itself, but the pipeline that feeds it external data.
Enterprise adoption of RAG has exploded over the last eighteen months. According to IDC, 73% of large organizations now run some form of retrieval-augmented system in production. Yet OWASP’s analysis of 400 real-world deployments found that fewer than 12% had put any retrieval-specific security controls in place. The gap is huge, and the consequences are beginning to show. Just in the past quarter, the number of publicly reported RAG-related data breaches rose 140% compared to the same period last year.
The good news? These threats are understandable and, with the right approach, manageable. This post breaks down the five new RAG-specific entries in OWASP’s latest list. For each, we’ll explain how the attack works, why it slips past existing radar, and what leading security architects are doing to defend against it. By the end, you’ll have a practical framework for assessing your own RAG stack and prioritizing security investments before an incident lands on your desk.
Why OWASP’s LLM Top 10 Matters for RAG
The OWASP Top 10 has been the go-to standard for web application security for two decades. When the organization turned its attention to large language models in 2025, the initial focus copied traditional AI risks: training data poisoning, model inversion, and overreliance on model outputs. RAG architectures were seen as a safer alternative because they ground responses in verifiable documents. Attackers quickly proved otherwise.
RAG adds a retrieval component that sits between the user and the model. This component queries vector databases, knowledge graphs, or web search APIs, then passes the retrieved context into the prompt. Each of these steps introduces new failure modes. A compromised retrieval source can poison the model’s output without ever touching the model’s weights. An overly permissive caching layer can expose sensitive documents across users. A cleverly crafted query can make the retriever pull conflicting information that confuses the generative logic.
OWASP’s 2026 update acknowledges this by introducing a dedicated “RAG & Grounding” category, under which the five threats we describe sit. It also revises the existing Prompt Injection entry to include retrieval-time injection as a first-class concern. For organizations that have built their AI strategy on RAG, ignoring these items is not an option. Compliance frameworks like the EU AI Act and NIST AI 600-1 now explicitly reference supply chain security for AI components, and the retrieval pipeline is very much part of that supply chain.
The 5 New RAG-Specific Threats
Each entry below represents a distinct attack vector that OWASP’s working group observed in production systems during 2025-2026. We present them in order of prevalence, starting with the most common.
1. Retrieved Document Prompt Injection
This attack takes advantage of the fact that most RAG systems concatenate retrieved text into the prompt with minimal sanitization. An attacker publishes a malicious web page or plants a poisoned document in a shared knowledge base. When the retriever fetches that content and the model processes it, hidden instructions override the original user intent.
Unlike classic prompt injection, where the user crafts a malicious input, this variant hides the payload in the trusted external source. Security teams often assume that because the retrieved documents come from “internal” repositories, they are safe. OWASP’s testing team showed that in 82% of cases, a single injected document could reliably redirect model behavior, even when system prompts explicitly forbade certain actions. The vector is especially dangerous in customer-facing chatbots that pull from documentation sites or user forums.
2. Knowledge Base Poisoning via Unverified Sources
Almost every enterprise RAG implementation connects to multiple, heterogeneous data sources: SharePoint, Confluence, wikis, CRM records, email archives, and increasingly, real-time web search. Not all of these sources get the same level of integrity checking. An attacker who gains write access to a low-stakes wiki page, or who compromises an API key for a news aggregator, can inject false content that the retriever eagerly consumes.
The OWASP report highlights a healthcare organization where a retired clinician’s wiki account stayed active. An external threat actor used it to insert outdated drug interaction data. Over two weeks, the RAG-powered clinical assistant referenced the poisoned information in 37 patient recommendations before the anomaly was caught. The root cause wasn’t a model flaw but a retrieval governance gap: no source freshness validation, no author reputation scoring, and no adversarial document classifier.
3. Data Leakage Through Retrieval Caching
Performance optimization often leads teams to add caching layers between the retriever and the model. When a similar query arrives, the system serves a previously retrieved document set without rerunning the full retrieval pipeline. That’s great for latency, but dangerous for access control.
OWASP documented a scenario in which a financial services firm cached retrieval results keyed by semantic similarity. An employee in the marketing department queried the RAG assistant about upcoming product launches and got back confidential merger documents. The reason? Their query was semantically close to one previously issued by a C-suite executive. The cached result bypassed document-level permissions entirely.
This threat is listed under “Sensitive Information Disclosure” in the OWASP list but is RAG-specific because it emerges from the interaction between caching logic and retrieval architectures that don’t preserve user context. With 68% of production RAG systems using some form of retrieval caching, the exposure surface is wide.
4. Model Confusion via Conflicting Contexts
RAG systems are designed to integrate multiple evidence snippets into a single answer. But what happens when the retrieved documents contradict each other? A naive implementation will pass all snippets to the model and hope it synthesizes them coherently. Attackers can exploit this by crafting queries that force the retriever to fetch a deliberate mix of accurate and malicious documents.
Researchers at Stanford recently showed that by seeding a knowledge base with just 7% adversarial documents, they could cause a RAG model to produce factually wrong answers 63% of the time, while the model expressed high confidence. The OWASP entry for “Improper Output Handling” now includes a sub-item for context-confusion attacks. The defense requires ranking documents not only by relevance but by factual consistency, an active area of research that few production systems have adopted.
5. RAG Chain-of-Thought Exploitation
Many advanced RAG systems use multi-step reasoning, where the model iteratively refines its query and retrieves additional documents. In these agentic RAG pipelines, the model emits intermediate reasoning steps that sometimes include private data or expose the retrieval logic.
OWASP found that in 44% of multi-turn RAG applications, the model’s chain-of-thought trace contained enough information to infer the structure of the underlying knowledge base, including which documents were present and which were not. An attacker can use this information to craft queries that deliberately miss, provoking the model to reveal even more about its retrieval graph. This falls under “Excessive Agency” in the OWASP list but is tailored to RAG’s iterative retrieval behavior.
Mitigating RAG Risks: A Layered Approach
Addressing these threats requires controls at every layer of the RAG stack, from data ingestion to retrieval to generation. Below are five practical mitigations that align with OWASP’s recommendations and have been validated in production environments.
Input and Output Sanitization at the Retrieval Boundary
The simplest and most impactful defense is to treat every retrieved document as untrusted input. Before appending a text chunk to the prompt, pass it through a classifier that detects known injection patterns, hidden commands, and anomalous formatting. Several open-source tools have emerged for this purpose, borrowing techniques from email anti-spam filtering.
One large telecom reduced injection success rates from 90% to under 3% by adding a lightweight BERT-based classifier that flags documents containing instruction-like language (e.g., “Ignore all previous,” “You must now,” “As an AI”). The classifier processes in under 15ms per chunk, adding negligible latency to most retrieval pipelines.
Source-Level Integrity Scoring
Rather than trusting all data sources equally, assign an integrity score based on origin freshness, author reliability, and edit history. A document from the official HR wiki, last updated by a verified employee, should carry a higher score than an anonymous comment scraped from an internal forum.
This scoring feeds into retrieval ranking: documents with low integrity scores are either excluded from the context window or clearly marked as “unverified” in the prompt. Several graph-based RAG implementations now maintain provenance graphs that propagate trust ratings through citation links. Early adopters report a 40% reduction in poisoning incidents after deploying dynamic integrity scoring.
Context-Aware Caching with User-Scoped Keys
To prevent cross-user data leakage, retrieval caches must be scoped not just to query semantics but to the user’s access level. Instead of a single global cache, maintain per-session or per-role caches. When a cached result is served, verify that the original query was made with equal or greater privileges than the current request.
This adds complexity, but database technologies like Redis now support attribute-based encryption keys that can encode user roles into cache lookups. A major bank implemented tenant-scoped caching for their RAG-powered customer support and eliminated all cache-based leakage incidents in a six-month trial.
Consistency-Based Re-ranking of Retrieved Documents
To combat context-confusion attacks, add a re-ranking step after initial retrieval that evaluates cross-document consistency. This can be done with entailment models or even a second, smaller LLM that compares candidate chunks. Documents that contradict a majority of the retrieved set are deprioritized or discarded.
Stanford’s research group recently released a consistency scorer that integrates directly with LangChain and LlamaIndex, achieving a 72% reduction in context-confusion success rates with only a 200ms overhead. As agentic RAG pipelines grow more popular, this re-ranking step becomes essential to prevent the model from being led astray by planted evidence.
Auditing and Monitoring the Retrieval Process
Finally, treat the retrieval pipeline as a security-critical component subject to continuous monitoring. Log every retrieval source, its integrity score, the chunks selected for the prompt, and whether any anomaly flags were triggered. Stream this data to your SIEM or AI observability platform.
One aerospace engineering firm detected a knowledge base poisoning attack within hours, not because a model behaved strangely, but because their retrieval logs showed a sudden influx of documents from an unverified SharePoint site that had not previously contributed to that query topic. The SOC team isolated the source and revoked the connector’s credentials before any damage occurred. OWASP’s guidance specifically calls out logging and monitoring as a cross-cutting security activity for all ten risks.
Building RAG Security into Your 2026 Roadmap
The five threats added to OWASP’s LLM Top 10 are not theoretical edge cases. They represent the most frequently observed attack patterns against production RAG systems today. The retrieval pipeline has become the soft underbelly of enterprise AI, largely because it was built for performance and accuracy, not adversarial robustness.
Yet the solutions don’t require a complete architectural overhaul. A layered approach, combining retrieval-time sanitization, source integrity scoring, context-aware caching, consistency re-ranking, and monitoring, can close the gap that attackers are currently exploiting. The organizations that moved fastest after the OWASP announcement are already seeing dividends: fewer security incidents, lower regulatory risk, and a clearer story to tell auditors and boards.
The energy company from our opening narrative learned the hard way. They spent the next three months rebuilding their retrieval pipeline with the very controls described here. Their CISO later told a conference audience that the breach was “the best investment in AI security we never wanted to make.”
If you’re responsible for a RAG system in production, or one heading to launch, start with a threat assessment aligned to the new OWASP list. Identify which data sources carry the highest risk, test your caching behavior under different user profiles, and run adversarial queries against your own assistant. The goal isn’t paranoia; it’s parity between the sophistication of your AI and the defenses that protect it. Subscribe to our newsletter for a detailed RAG security checklist, and join our upcoming webinar where we walk through a live audit of a production retrieval pipeline.



