RL Blog

Topics

All Blog PostsAppSec & Supply Chain SecurityDev & DevSecOpsProducts & TechnologySecurity OperationsThreat Research

Follow us

XX / TwitterLinkedInLinkedInFacebookFacebookInstagramInstagramYouTubeYouTubeblueskyBluesky

Subscribe

Get the best of RL Blog delivered to your in-box weekly. Stay up to date on key trends, analysis and best practices across threat intelligence and software supply chain security.

ReversingLabs: The More Powerful, Cost-Effective Alternative to VirusTotalSee Why
Skip to main content
Contact UsSupportLoginBlogCommunity
reversinglabsReversingLabs: Home
Solutions
Secure Software OnboardingSecure Build & ReleaseProtect Virtual MachinesIntegrate Safe Open SourceGo Beyond the SBOM
Increase Email Threat ResilienceDetect Malware in File Shares & StorageAdvanced Malware Analysis SuiteICAP Enabled Solutions
Scalable File AnalysisHigh-Fidelity Threat IntelligenceCurated Ransomware FeedAutomate Malware Analysis Workflows
Products & Technology
Spectra Assure®Software Supply Chain SecuritySpectra DetectHigh-Speed, High-Volume, Large File AnalysisSpectra AnalyzeIn-Depth Malware Analysis & Hunting for the SOCSpectra IntelligenceAuthoritative Reputation Data & Intelligence
Spectra CoreIntegrations
Industry
Energy & UtilitiesFinanceHealthcareHigh TechPublic Sector
Partners
Become a PartnerValue-Added PartnersTechnology PartnersMarketplacesOEM Partners
Alliances
Resources
BlogContent LibraryCybersecurity GlossaryConversingLabs PodcastEvents & WebinarsLearning with ReversingLabsWeekly Insights Newsletter
Customer StoriesDemo VideosDocumentationOpenSource YARA Rules
Company
About UsLeadershipCareersSeries B Investment
EventsRL at RSAC
Press ReleasesIn the News
Pricing
Software Supply Chain SecurityMalware Analysis and Threat Hunting
Request a demo
Menu
AppSec & Supply Chain SecurityMay 8, 2025

Indirect prompt injection attacks target common LLM data sources

Malicious instructions buried in LLM sources such as documents can poison ML models. Here's how it works — and how to protect your AI systems.

John P. Mello Jr.
John P. Mello Jr., Freelance technology writer.John P. Mello Jr.
FacebookFacebookXX / TwitterLinkedInLinkedInblueskyBlueskyEmail Us
brown glass bottle of poison

While the shortest distance between two points is a straight line, a straight-line attack on a large language model isn't always the most efficient — and least noisy — way to get the LLM to do bad things. That's why malicious actors have been turning to indirect prompt injection attacks on LLMs.

Indirect prompt injection attacks involve malicious instructions embedded within external content — documents, web pages, or emails — that an LLM processes. The model may interpret these instructions as valid user commands, leading to unintended actions such as data leaks or misinformation.

A team of researchers recently wrote that indirect prompt injection attacks are successful because LLMs lack the ability to distinguish between informational context and actionable instructions. In addition, LLMs lack awareness when executing instructions within external content. The research team wrote on arXivLabs about their approach to assessing the attack method, as well as techniques for protecting LLMs:

To address this critical yet under-explored issue, we introduce the first benchmark for indirect prompt injection attacks, named BIPIA, to assess the risk of such vulnerabilities. Using BIPIA, we evaluate existing LLMs and find them universally vulnerable.

Here's what you need to know about indirect prompt injection attacks — and what you can do to secure your AI systems against them.

Get White Paper: How the Rise of AI Will Impact Software Supply Chain Security

Indirect LLM attacks are challenging to defend against

Indirect prompt injection attacks are powerful because they exploit the LLM’s trust in external sources, including user-generated data, websites, and comments, bypassing any need for direct access to the system prompt or user interface, said Chris Acevedo, a principal consultant with the security firm Optiv.

Unlike traditional prompt injection, where an attacker tries to manipulate AI by feeding it crafted input directly, this technique hides malicious instructions inside content that the model reads, like a poisoned well disguised as clean water. This makes them stealthy and harder to trace, since the injection is hidden in data the LLM is simply reading, not in user input.

Chris Acevedo

Christopher Cullen, a vulnerability researcher in the CERT division of the Software Engineering Institute at Carnegie Mellon University, said indirect prompt injection attacks can be challenging for blue teams because they give a sufficiently positioned and competent attacker the capability to either control an underlying LLM system or prevent expected function of it.

[In] comparison to direct prompt injection, this attacker can be positioned in a way not immediately obvious to a blue team member. This gives that attacker control over the systems from a position that cannot be directly addressed by blue teams without changing the underlying way that their LLM draws data.

Christopher Cullen

Cullen explained that in an enterprise that uses an LLM trained on emails, for example, an attacker could provide enough emails with malicious content in them that they may alter the LLM system. "Blue team members may believe that their system is blocking the malicious emails, but if the LLM is accessing a malicious email to form a response to a user, the attacker can alter the expected behavior of the LLM," he said.

Stephen Kowski, field CTO at SlashNext, said the attacks can bypass security controls since they’re delivered through trusted content channels that the LLM is asked to analyze.

The attack payload activates only when the content is processed by the LLM, making detection particularly challenging without specialized AI security tools that can identify and block manipulated content before it reaches the model.

Stephen Kowski

Greg Anderson, co-founder and CEO of DefectDojo, said indirect prompt injection attacks are especially dangerous because they exploit the very foundation of how LLMs are built, by training on vast, uncurated datasets. "Unlike direct prompt injections, which target the model through cleverly crafted user inputs, indirect prompt injections poison the model’s knowledge base by inserting malicious content into the public data it learns from.

The challenge is that most LLMs prioritize scale, scraping as much data as possible without verifying the trustworthiness of the source. That creates a wide-open surface for manipulation.

Greg Anderson

Anderson cited one attack involving a group of Reddit users who successfully manipulated various LLMs so that they would not recommend their favorite restaurants and thus prevent crowds. "While relatively benign, and potentially even hilarious, this same technique can have devastating consequences on code generation when used to recommend intentionally malicious code," he said.

Understand the threat to software supply chains

Indirect prompt injection attacks pose a significant threat to the software supply chain because LLMs are increasingly integrated into development tools and workflows and so can inject malicious code or configurations into software projects, said Jason Dion, chief product officer and founder of the health care firm Akylade.

If an attacker can compromise the data sources used by an LLM by affecting the source code repositories or the LLM's documentation and training, then this can lead to compromises that might impact countless downstream users and connected systems.

Jason Dion

Erich Kron, a security awareness advocate at KnowBe4, said that with more and more people using AI coding tools, the risk of including potentially vulnerable or malicious code that was learned from malicious sources increases.

If bad actors create a number of GitHub repositories that all include a purposely created vulnerability in the code, and the LLM is told to learn from or use those as code sources, it is very possible that could include that same vulnerability in the code it produces for the LLM user, which may then include it in their product.

Erich Kron

Optiv's Acevedo noted that as more developers rely on LLMs to vet packages, review pull requests, and write code, the content these tools consume becomes an attack vector. A malicious actor could hide an indirect prompt injection in a package’s README or metadata, tricking the model into recommending or installing something unsafe, he said.

There have been demonstrations of package managers like PyPI or npm hosting packages whose documentation contains prompt injection payloads designed to influence AI-assisted tools.

Chris Acevedo

Steps to addressing the threat

The research team's analysis noted:

Our analysis identifies two key factors contributing to their success: LLMs' inability to distinguish between informational context and actionable instructions, and their lack of awareness in avoiding the execution of instructions within external content.

Based on these findings, the team proposes two novel defense mechanisms: boundary awareness and explicit reminders. "Extensive experiments demonstrate that our black-box defense provides substantial mitigation, while our white-box defense reduces the attack success rate to near-zero levels, all while preserving the output quality of LLMs," they wrote.

Acevedo said that because indirect prompt injection is happening now, "the more we rely on LLMs to interact with external data, the more doors we’re opening."

These attacks don’t require deep technical skill or zero-day exploits. They rely on something simpler: the model’s willingness to follow whatever text it sees, regardless of where it came from. In a world where AI is reading everything, we need to start asking, 'Who’s writing it?

Chris Aceve

While there is no silver bullet for mitigating indirect prompt injection attacks, Acevedo suggested the following steps to reduce risk in your organization immediately:

  • Sanitize content before it’s fed into an LLM.
  • Tell the model what is input and what is context and instruct it not to follow commands from external data.
  • Tag untrusted sources so models can treat them more cautiously.
  • Restrict what LLMs can do, especially if they’re allowed to take actions such as executing code or writing files.
  • Monitor outputs for weird behavior, and red-team your systems by simulating these attacks regularly.

Keep learning

  • Get up to speed on the state of software security with RL's Software Supply Chain Security Report 2026. Plus: See the the webinar to discussing the findings.
  • Learn why binary analysis is a must-have in the Gartner® CISO Playbook for Commercial Software Supply Chain Security.
  • Take action on securing AI/ML with our report: AI Is the Supply Chain. Plus: See RL's research on nullifAI and watch how RL discovered the novel threat.
  • Get the report: Go Beyond the SBOM. Plus: See the CycloneDX xBOM webinar.

Explore RL's Spectra suite: Spectra Assure for software supply chain security, Spectra Detect for scalable file analysis, Spectra Analyze for malware analysis and threat hunting, and Spectra Intelligence for reputation data and intelligence.

Tags:AppSec & Supply Chain Security

More Blog Posts

Finger on map

LLMmap puts its finger on ML attacks

Researchers show how LLM fingerprinting can be used to automate generation of customized attacks.

Learn More about LLMmap puts its finger on ML attacks
LLMmap puts its finger on ML attacks
Vibeware bad vibes

Vibeware: More than bad vibes for AppSec

Threat actors are leveraging the freewheeling vibe-coding trend to deliver malicious software at scale.

Learn More about Vibeware: More than bad vibes for AppSec
Vibeware: More than bad vibes for AppSec
CRA accelerates advantage

The CRA is coming: Are you ready?

Here's how the EU's Cyber Resilience Act will reshape the software industry — and how that accelerates advantages.

Learn More about The CRA is coming: Are you ready?
The CRA is coming: Are you ready?
AI agents risk

Claude Mythos: Get your AppSec game on

Anthropic's new AI is a 'step change' for exposing software flaws — but also ramps up exploits. Are you ready?

Learn More about Claude Mythos: Get your AppSec game on
Claude Mythos: Get your AppSec game on

Spectra Assure Free Trial

Get your 14-day free trial of Spectra Assure for Software Supply Chain Security

Get Free TrialMore about Spectra Assure Free Trial
Blog
Events
About Us
Webinars
In the News
Careers
Demo Videos
Cybersecurity Glossary
Contact Us
reversinglabsReversingLabs: Home
Privacy PolicyCookiesImpressum
All rights reserved ReversingLabs © 2026
XX / TwitterLinkedInLinkedInFacebookFacebookInstagramInstagramYouTubeYouTubeblueskyBlueskyRSSRSS
Back to Top