Emerging threat: Web-based indirect prompt injection attacks detected in the wild

Mar 5, 2026 | Threat Intelligence Research

Indirect Prompt Injection: An Emerging Threat to AI Systems

TL;DR

Researchers at Palo Alto Networks have observed a rise in indirect prompt injection (IDPI) attacks targeting large language models (LLMs) embedded in web-facing systems. These attacks use a range of techniques to manipulate AI outputs, leading to potential financial loss and data breaches.

Main Analysis

Indirect prompt injection (IDPI) poses a significant challenge to AI systems as it exploits hidden instructions within benign-looking web content, which LLMs then interpret as commands. This threat capitalizes on the rapid adoption of AI technologies in everyday applications, such as web browsers and customer service tools. As more organizations integrate these models, they become prime targets for adversaries seeking to manipulate results or execute unauthorized actions.

Palo Alto Networks’ telemetry indicates a shift from theoretical risks to real-world exploitation of IDPI. The researchers documented attacks involving data destruction, SEO manipulation for phishing, and unauthorized transactions. An example includes the first identified case where malicious actors evaded an AI-based ad review system, using sophisticated methods to embed prompts designed to deceive LLMs into approving fraudulent content. Such developments illustrate the evolution of attacker strategies as they adopt more complex payloads to achieve harmful objectives.

Furthermore, the study reveals a diverse range of delivery and jailbreak techniques used in IDPI attacks. These include visual concealment methods, such as zero font sizing or off-screen positioning, and social engineering tricks that exploit the LLM’s understanding of context. The taxonomy established categorizes attacker intents into low, medium, high, and critical severity levels based on the potential impact. This stratification aid in prioritizing defense mechanisms based on the nature of the threats.

Defensive Context

Organizations leveraging LLMs need to be acutely aware of the potential for IDPI attacks, particularly those that process untrusted web content during operations like summarization or decision-making. High-value environments, such as e-commerce platforms and online financial services, are particularly vulnerable due to their reliance on user-generated content and external data. Organizations that do not heavily engage with AI or LLMs in customer-facing applications may be less affected.

Why This Matters

IDPI attacks present real-world risks to sectors heavily reliant on AI, particularly e-commerce, finance, and online advertising. Entities in these domains should be prepared for increasingly sophisticated manipulations that can result in both financial loss and reputational damage.

Defender Considerations

Defenders must enhance their detection capabilities to differentiate benign from malicious prompts effectively. Regular monitoring of user-facing AI systems, along with the implementation of behavior analysis to identify potential anomalies caused by IDPI, could be crucial. Additionally, organizations should implement methods that track and analyze prompt visibility and context to mitigate the risk of these attacks.

Indicators of Compromise (IOCs)

Websites:
- 1winofficialsite[.]in
- cblanke2.pages[.]dev
- dylansparks[.]com
- myshantispa[.]com
- reviewerpress[.]com
Payment URLs:
- buy.stripe[.]com/7sY4gsbMKdZwfx39Sq0oM00

These indicators represent sites associated with observed IDPI activity and may serve as starting points for investigation or monitoring.

Click here for the full article

← Leveraging MDR to enhance cybersecurity resilience in educational institutions